CMPUT680 - Winter 2001


CMPUT680 - Winter 2001 Topic 6: Register Allocation and Instruction Scheduling José Nelson Amaral http://www.cs.ualberta.ca/~amaral/courses/680 CMPUT 680 - Compiler Design and Optimization 1

Reading List
Tiger book: chapters 10 and 11
Dragon book: chapter 10
Other papers as assigned in class or in homework

Register Allocation
Motivation
Live ranges and interference graphs
Problem formulation
Solution methods

Goals of Optimized Register Allocation
To assign registers to the variables that are most profitable to keep in registers.
To use the same register for multiple variables when it is legal to do so.

Liveness
Intuitively, a variable v is live if it holds a value that may be needed in the future. In other words, v is live at a point p_i if:
(i) v has been defined in a statement that precedes p_i on some path,
(ii) v may be used by a statement s_j, and there is a path from p_i to s_j, and
(iii) v is not killed between p_i and s_j.

Live Variables
a: s1 = ld(x)
b: s2 = s1 + 4
c: s3 = s1 8
d: s4 = s1 - 4
e: s5 = s1 / 2
f: s6 = s2 * s3
g: s7 = s4 - s5
h: s8 = s6 * s7
A variable v is live between the point p_i immediately after its definition and the point p_j immediately after its last use. The interval [p_i, p_j] is the live range of the variable v.
Which variables have the longest live range in the example? Variables s1 and s2 each have a live range of four statements.

Register Allocation
(Same basic block, statements a to h, as on the previous slide.)
How can we find the minimum number of registers that this basic block requires to avoid spilling values to memory? We have to compute the live ranges of all variables and find the "fattest" statement, that is, the one with the most live variables.
Which statements have the most variables live simultaneously?

Register Allocation
(Same basic block, statements a to h; the figure draws the live ranges of s1 to s7 beside the code.)
At statement d, variables s1, s2, s3, and s4 are live, and during statement e, variables s2, s3, s4, and s5 are live.
Reading live ranges off a picture does not scale, so we have to use some math: our choice is liveness analysis.

Live-in and Live-out
(Same basic block, statements a to h, with the live ranges of s1 to s7.)
live-in(r): set of variables that are live at the point immediately before statement r.
live-out(r): set of variables that are live at the point immediately after statement r.

Live-in and Live-out: Program Example
(Same basic block, statements a to h, with the live ranges of s1 to s7.)
What are live-in(e) and live-out(e)?
live-in(e) = {s1, s2, s3, s4}
live-out(e) = {s2, s3, s4, s5}
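The live-in and live-out sets of every statement in a basic block can be computed with one backward scan. The following is a minimal sketch, not the course's implementation. It assumes that nothing is live at the end of the block and that x lives in memory and is not a register candidate; the names BLOCK and liveness are made up for this illustration. Only definitions and uses matter for liveness, so the operators are omitted.

```python
# Each statement is (label, defined variable, variables used).
BLOCK = [
    ("a", "s1", []),            # s1 = ld(x); x is assumed to live in memory
    ("b", "s2", ["s1"]),
    ("c", "s3", ["s1"]),
    ("d", "s4", ["s1"]),
    ("e", "s5", ["s1"]),
    ("f", "s6", ["s2", "s3"]),
    ("g", "s7", ["s4", "s5"]),
    ("h", "s8", ["s6", "s7"]),
]

def liveness(block, live_at_exit=frozenset()):
    """Return {label: (live_in, live_out)} by scanning the block backwards."""
    result = {}
    live = set(live_at_exit)
    for label, defined, used in reversed(block):
        live_out = frozenset(live)
        live.discard(defined)   # the definition kills the variable
        live.update(used)       # each use makes the variable live before it
        result[label] = (frozenset(live), live_out)
    return result

sets = liveness(BLOCK)
live_in_e, live_out_e = sets["e"]
# The "fattest" point: the largest live set over all program points.
max_pressure = max(len(s) for pair in sets.values() for s in pair)
print(sorted(live_in_e), sorted(live_out_e), max_pressure)
```

Under these assumptions the largest live set has four variables, which agrees with the sets given for statement e.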

Live-in and Live-out in Control Flow Graphs
The entry point of a basic block B is the point before its first statement. The exit point is the point after its last statement.
live-in(B): set of variables that are live at the entry point of the basic block B.
live-out(B): set of variables that are live at the exit point of the basic block B.

Live-in and Live-out of Basic Blocks (Aho-Sethi-Ullman, p. 544)
Compute live-in and live-out for each basic block.
B1: a := b + c; d := d - b; e := a + f
B2: f := a - d
B3: b := d + f; e := a - c
B4: b := d + c
[Figure: control-flow graph of B1 to B4; the two exit edges are annotated "b, c, d, e, f live" and "b, d, e, f live".]
live-in(B1) = {b,c,d,f}      live-out(B1) = {a,c,d,e,f}
live-in(B2) = {a,c,d,e}      live-out(B2) = {c,d,e,f}
live-in(B3) = {a,c,d,f}      live-out(B3) = {b,c,d,e,f}
live-in(B4) = {c,d,e,f}      live-out(B4) = {b,c,d,e,f}
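Across basic blocks, liveness is usually computed by iterating the equations live-out(B) = union of live-in(S) over the successors S of B, and live-in(B) = use(B) ∪ (live-out(B) − def(B)), until nothing changes. The sketch below applies this to the four blocks of the example. The edges of the control-flow graph and the block that each exit annotation belongs to are ASSUMPTIONS: the figure did not survive, so they were inferred from the listed sets. All names are illustrative.

```python
# Statements per block as (defined variable, used variables).
BLOCKS = {
    "B1": [("a", ["b", "c"]), ("d", ["d", "b"]), ("e", ["a", "f"])],
    "B2": [("f", ["a", "d"])],
    "B3": [("b", ["d", "f"]), ("e", ["a", "c"])],
    "B4": [("b", ["d", "c"])],
}
# ASSUMPTION: edges and exit placement are inferred, not read from the figure.
SUCCESSORS = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"], "B4": ["B1"]}
LIVE_AT_EXIT = {"B3": {"b", "c", "d", "e", "f"}, "B4": {"b", "d", "e", "f"}}

def use_def(statements):
    """use = variables read before any write in the block; def = variables written."""
    use, defined = set(), set()
    for target, operands in statements:
        use.update(v for v in operands if v not in defined)
        defined.add(target)
    return use, defined

def solve_liveness(blocks, successors, live_at_exit):
    """Iterate the dataflow equations to a fixed point, starting from empty sets."""
    live_in = {name: set() for name in blocks}
    live_out = {name: set() for name in blocks}
    changed = True
    while changed:
        changed = False
        for name, statements in blocks.items():
            use, defined = use_def(statements)
            out = set(live_at_exit.get(name, ()))
            for succ in successors[name]:
                out |= live_in[succ]
            new_in = use | (out - defined)
            if out != live_out[name] or new_in != live_in[name]:
                live_out[name], live_in[name] = out, new_in
                changed = True
    return live_in, live_out

live_in, live_out = solve_liveness(BLOCKS, SUCCESSORS, LIVE_AT_EXIT)
for name in BLOCKS:
    print(name, sorted(live_in[name]), sorted(live_out[name]))
```

With the assumed edges, the fixed point reproduces the eight sets listed on the slide.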

Register-Interference Graph
A register-interference graph is an undirected graph that summarizes liveness analysis at the variable level as follows:
A node is a variable or temporary that is a candidate for register allocation.
An edge connects nodes v1 and v2 if there is some statement in the program where variables v1 and v2 are live simultaneously. (Variables v1 and v2 are said to interfere in this case.)

Register Interference Graph: Program Example
(Same basic block, statements a to h, with the live ranges of s1 to s7.)
[Figure: interference graph with nodes s1 to s7.]
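One way to build the interference graph for this block is to collect the live set at every program point and connect each pair of variables that appear together. This is a sketch under the same assumptions as before (nothing live at the end of the block, x not a register candidate); the function and variable names are illustrative, and the edge set it produces was not checked against the lost figure.

```python
from itertools import combinations

# Each statement is (label, defined variable, variables used).
BLOCK = [
    ("a", "s1", []),
    ("b", "s2", ["s1"]),
    ("c", "s3", ["s1"]),
    ("d", "s4", ["s1"]),
    ("e", "s5", ["s1"]),
    ("f", "s6", ["s2", "s3"]),
    ("g", "s7", ["s4", "s5"]),
    ("h", "s8", ["s6", "s7"]),
]

def interference_graph(block, live_at_exit=frozenset()):
    """Return {variable: set of interfering variables} for a straight-line block."""
    graph = {defined: set() for _, defined, _ in block}
    live = set(live_at_exit)
    points = [set(live)]                 # live set at each program point
    for _, defined, used in reversed(block):
        live.discard(defined)
        live.update(used)
        points.append(set(live))
    for live_set in points:
        # Variables live at the same point interfere pairwise.
        for u, v in combinations(sorted(live_set), 2):
            graph.setdefault(u, set()).add(v)
            graph.setdefault(v, set()).add(u)
    return graph

graph = interference_graph(BLOCK)
edge_count = sum(len(neighbors) for neighbors in graph.values()) // 2
for node in sorted(graph):
    print(node, sorted(graph[node]))
```

Note that s1 and s5 do not interfere in this sketch: s1 has its last use in statement e, exactly where s5 is defined, so the two are never live at the same point.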

Register Allocation by Graph Coloring
Background: A graph is k-colorable if each node can be assigned one of k colors in such a way that no two adjacent nodes have the same color.
Basic idea: A k-coloring of the interference graph can be directly mapped to a legal register allocation by mapping each color to a distinct register. The coloring property ensures that no two variables that interfere with each other are assigned the same register.

Register Allocation by Graph Coloring
The basic idea behind register allocation by graph coloring is to
1. Build the register interference graph,
2. Attempt to find a k-coloring for the interference graph.

Complexity of the Graph Coloring Problem
The problem of determining whether an undirected graph is k-colorable is NP-hard for k ≥ 3.
It is also hard to find approximate solutions to the graph coloring problem.

Register Allocation
Question: What should the compiler do if a register-interference graph is not k-colorable? Or if it cannot efficiently find a k-coloring even though the graph is k-colorable?
Answer: Repeatedly select less profitable variables for spilling (i.e., not to be assigned to registers) and remove them from the interference graph until the graph becomes k-colorable.

Estimating Register Profitability
The register profitability of variable v is estimated by:
$\mathrm{profitability}(v) = \sum_{i} \mathrm{freq}(i) \times \mathrm{savings}(v, i)$
freq(i): estimated execution frequency of basic block i (obtained by profiling or by static analysis).
savings(v, i): estimated number of processor cycles that would be saved due to a reduced number of load and store instructions in basic block i, if a register was assigned to variable v.
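As a small illustration of the estimate, the sum can be evaluated directly from two tables. The block names, frequencies, and cycle savings below are invented numbers, used only to show how the least profitable variable (the first spill candidate) would be picked.

```python
def profitability(v, freq, savings):
    """Sum of freq(i) * savings(v, i) over all basic blocks i."""
    return sum(freq[i] * savings.get((v, i), 0) for i in freq)

# Hypothetical data: B2 is a loop body that runs ten times as often as B1.
freq = {"B1": 1, "B2": 10}
savings = {("x", "B1"): 2, ("x", "B2"): 3, ("y", "B1"): 4}

scores = {v: profitability(v, freq, savings) for v in ("x", "y")}
spill_candidate = min(scores, key=scores.get)
print(scores, spill_candidate)
```

Here y saves more cycles per execution of B1, but x is used in the frequently executed block, so x ends up more profitable to keep in a register.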

Heuristic Solution for Graph Coloring
Key observation: Let G' be the graph obtained from G by removing a node x with degree < k, together with all edges associated with x. Then G is k-colorable if G' is k-colorable.

A 2-Phase Register Allocation Algorithm
Forward pass: Build IG, Simplify
Reverse pass: Select and Spill

Heuristic "Optimistic" Algorithm
/* neighbors(v) contains a list of the neighbors of v. */
/* Build step */
Build the register-interference graph, G;
/* Forward pass */
Initialize an empty stack;
repeat
    while G has a node v such that |neighbors(v)| < k do
        /* Simplify step */
        Push (v, neighbors(v), no-spill);
        Delete v and its edges from G
    end while
    if G is non-empty then
        /* Spill step */
        Choose the least profitable node v as a potential spill node;
        Push (v, neighbors(v), may-spill);
        Delete v and its edges from G
    end if
until G is an empty graph;

Heuristic "Optimistic" Algorithm
/* Reverse pass */
while the stack is non-empty do
    Pop (v, neighbors(v), tag);
    N := set of nodes in neighbors(v);
    if (tag = no-spill) then
        /* Select step */
        Select a register R for v such that R is not assigned to nodes in N;
        Insert v as a new node in G;
        Insert an edge in G from v to each node in N;
    else /* tag = may-spill */
        if v can be assigned a register R such that R is not assigned to nodes in N then
            /* Optimism paid off: need not spill */
            Assign register R to v;
            Insert v as a new node in G;
            Insert an edge in G from v to each node in N;
        else
            /* Need to spill v */
            Mark v as not being allocated a register
        end if
    end if
end while
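The two passes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of any particular compiler: the graph is a dictionary of neighbor sets, registers are numbered 0 to k-1, ties are broken by dictionary order, and the profitability table is optional. Instead of storing neighbors(v) on the stack, the sketch looks up the neighbors of v in the original graph that already have a register, which gives the same set at pop time.

```python
def allocate(graph, k, profitability=None):
    """Optimistic coloring. graph: {node: set(neighbors)}. Returns (assignment, spilled)."""
    profitability = profitability or {}
    work = {v: set(ns) for v, ns in graph.items()}   # mutable copy of G
    stack = []
    # Forward pass: empty the graph, pushing every node on the stack.
    while work:
        low = [v for v in work if len(work[v]) < k]
        if low:
            v, tag = low[0], "no-spill"              # Simplify step
        else:
            v = min(work, key=lambda n: profitability.get(n, 0))
            tag = "may-spill"                        # Spill step (potential spill)
        stack.append((v, tag))
        for n in work.pop(v):
            work[n].discard(v)
    # Reverse pass: pop nodes and pick a register not used by colored neighbors.
    assignment, spilled = {}, []
    while stack:
        v, tag = stack.pop()
        taken = {assignment[n] for n in graph[v] if n in assignment}
        free = [r for r in range(k) if r not in taken]
        if free:
            assignment[v] = free[0]                  # for may-spill: optimism paid off
        else:
            spilled.append(v)                        # can only happen to may-spill nodes
    return assignment, spilled

# A 4-cycle: every node has degree 2, so with k = 2 nothing can be simplified at first.
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
cycle_regs, cycle_spills = allocate(cycle, 2)

# A complete graph on four nodes cannot be colored with 3 registers.
clique = {v: {u for u in "wxyz" if u != v} for v in "wxyz"}
clique_regs, clique_spills = allocate(clique, 3)
print(cycle_regs, cycle_spills, clique_regs, clique_spills)
```

The 4-cycle shows why deferring the spill decision helps: the node pushed as may-spill still receives a register in the reverse pass because its two neighbors end up sharing one.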

Remarks
The above register allocation algorithm based on graph coloring is both efficient (linear time) and effective.
It has been used in many industry-strength compilers to obtain significant improvements over simpler register allocation heuristics.

Extensions
Coalescing
Live range splitting

Coalescing
In the sequence of intermediate-level instructions with a copy statement below, assume that registers are allocated to both variables x and y.
x := ...
y := x
... := y
There is an opportunity for further optimization by eliminating the copy statement if x and y are assigned the same register.
The constraint that x and y receive the same register can be modeled by coalescing the nodes for x and y in the interference graph, i.e., by treating them as the same variable.
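In graph terms, coalescing replaces the two nodes by a single node whose neighbors are the union of both neighbor sets. A minimal sketch follows, assuming the graph is a dictionary of neighbor sets and naming the merged node "x-y"; both choices are illustrative.

```python
def coalesce(graph, x, y):
    """Return a new graph in which nodes x and y are merged into one node 'x-y'."""
    if y in graph[x]:
        raise ValueError("x and y interfere and cannot share a register")
    merged = f"{x}-{y}"
    neighbors = (graph[x] | graph[y]) - {x, y}
    # Copy every other node, dropping references to the two old nodes.
    new_graph = {v: set(ns) - {x, y} for v, ns in graph.items() if v not in (x, y)}
    new_graph[merged] = set(neighbors)
    for n in neighbors:
        new_graph[n].add(merged)
    return new_graph

# Hypothetical graph: x interferes with a, y interferes with a and b.
g = {"x": {"a"}, "y": {"a", "b"}, "a": {"x", "y"}, "b": {"y"}}
merged_graph = coalesce(g, "x", "y")
print(merged_graph)
```

The merged node can have a higher degree than either original node, which is why an allocator may decline to coalesce even when the copy could be removed.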

An Extension with Coalesce
Build IG, Simplify, Coalesce, Select and Spill

Register Allocation with Coalescing (Appel, p. 240)
1. Build: build the register interference graph G and categorize nodes as move-related or non-move-related.
2. Simplify: one at a time, remove non-move-related nodes of low (< K) degree from G.
3. Coalesce: conservatively coalesce G: only coalesce nodes a and b if the resulting a-b node has fewer than K neighbors.
4. Freeze: if neither coalesce nor simplify works, freeze a move-related node of low degree, making it non-move-related and available for simplify.

Register Allocation with Coalescing (Appel, p. 240)
5. Spill: if there are no low-degree nodes, select a node for potential spilling.
6. Select: pop each element of the stack, assigning colors.
[Figure: flow among the phases (re)build, simplify, coalesce, freeze, potential spill, select, and actual spill.]
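The conservative test in step 3 can be written as a small predicate. The sketch below uses the criterion exactly as stated on the slide (the merged node must have fewer than K neighbors); the criterion described in Appel's book is less strict and counts only neighbors of degree K or more. The four-node graph is a hypothetical example.

```python
def can_coalesce(graph, a, b, k):
    """Conservative test: merge a and b only if the merged node has fewer than k neighbors."""
    if b in graph[a]:
        return False                      # interfering nodes can never be merged
    combined = (graph[a] | graph[b]) - {a, b}
    return len(combined) < k

# Hypothetical graph with copies d := c and j := b; edges are c-b, d-b, and j-d.
g = {"c": {"b"}, "d": {"b", "j"}, "b": {"c", "d"}, "j": {"d"}}
ok_with_4 = can_coalesce(g, "c", "d", 4)      # merged node has neighbors {b, j}
ok_with_2 = can_coalesce(g, "c", "d", 2)      # two neighbors is not fewer than 2
interfering = can_coalesce(g, "b", "d", 4)    # b and d share an edge
print(ok_with_4, ok_with_2, interfering)
```

A merged node that passes this test has low degree, so it can always be simplified later and coalescing cannot turn a colorable graph into one that needs a spill.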

Step 1: Compute Live Ranges
LIVE-IN: k j
g := mem[j+12]
h := k - 1
f := g + h
e := mem[j+8]
m := mem[j+16]
b := mem[f]
c := e + 8
d := c
k := m + 4
j := b
LIVE-OUT: d k j
[Figure: live ranges of k, j, g, h, f, e, m, b, c, and d drawn beside the code.]
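The interference graph for this block can be derived with one backward scan: at each definition, the defined variable interferes with everything live after the instruction. The sketch assumes the usual refinement for copies, namely that the target of a move does not interfere with its source, which is what lets c and d (and b and j) be coalesced later. The tuple encoding and all names are illustrative, and the resulting edges were not checked against the lost figure.

```python
# (defined variable, used variables, is_move) for each instruction, in program order.
CODE = [
    ("g", ["j"], False),        # g := mem[j+12]
    ("h", ["k"], False),        # h := k - 1
    ("f", ["g", "h"], False),   # f := g + h
    ("e", ["j"], False),        # e := mem[j+8]
    ("m", ["j"], False),        # m := mem[j+16]
    ("b", ["f"], False),        # b := mem[f]
    ("c", ["e"], False),        # c := e + 8
    ("d", ["c"], True),         # d := c
    ("k", ["m"], False),        # k := m + 4
    ("j", ["b"], True),         # j := b
]
LIVE_OUT = {"d", "k", "j"}

def build(code, live_out):
    """Return (interference graph, list of moves, live-in set of the block)."""
    graph, moves = {}, []
    live = set(live_out)
    for defined, used, is_move in reversed(code):
        for other in live:
            # ASSUMPTION: a move target does not interfere with the move source.
            if other != defined and not (is_move and other in used):
                graph.setdefault(defined, set()).add(other)
                graph.setdefault(other, set()).add(defined)
        if is_move:
            moves.append((defined, used[0]))
        live.discard(defined)
        live.update(used)
    return graph, moves, live

graph, moves, live_in = build(CODE, LIVE_OUT)
move_related = {v for pair in moves for v in pair}
# Nodes that Simplify may remove first when K = 4.
simplifiable = sorted(v for v in graph if v not in move_related and len(graph[v]) < 4)
print(sorted(live_in), simplifiable)
```

The scan ends with the live set {j, k}, matching the LIVE-IN annotation, and h, a non-move-related node of degree 2, is among the first candidates for simplification.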

Step 3: Simplify and Coalesce (K=4)
[Figure sequence: one node leaves the interference graph per frame. The stack after each frame is listed top first.]
Simplify h: (h, no-spill)
Simplify g: (g, no-spill) (h, no-spill)
Simplify k: (k, no-spill) (g, no-spill) (h, no-spill)
Simplify f: (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill)
Simplify e: (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill)
Simplify m: (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill)
Coalesce c and d: the stack is unchanged. Why can we not simplify? The remaining nodes j, b, d, and c are move-related, and move-related nodes cannot be simplified.
Simplify c-d: (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill)
Coalesce b and j: the stack is unchanged.
Simplify b-j: (b-j, no-spill) (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill)

Step 3: Select (K=4)
Registers: R1 R2 R3 R4
Stack (top first): (b-j, no-spill) (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill)
[Figure sequence: the full interference graph (nodes f, j, h, k, g, d, e, b, c, m) is colored one stack entry per frame as the entries are popped.]

Could we do the allocation in the previous example with 3 registers?

Simplify, Freeze, Spill, and Coalesce (K=3)
[Figure sequence: one node leaves the interference graph per frame. The stack after each frame is listed top first.]
Simplify h: (h, no-spill)
Simplify g: (g, no-spill) (h, no-spill)
Freeze: coalescing would make things worse. We can freeze the move d-c.
Simplify c: (c, no-spill) (g, no-spill) (h, no-spill)
Spill e: (e, may-spill) (c, no-spill) (g, no-spill) (h, no-spill)
Neither coalescing nor freezing helps us. At this point we should use some profitability analysis to choose a node as may-spill.
Simplify f: (f, no-spill) (e, may-spill) (c, no-spill) (g, no-spill) (h, no-spill)
Simplify m: (m, no-spill) (f, no-spill) (e, may-spill) (c, no-spill) (g, no-spill) (h, no-spill)
Coalesce j and b: the stack is unchanged.
Simplify d: (d, no-spill) (m, no-spill) (f, no-spill) (e, may-spill) (c, no-spill) (g, no-spill) (h, no-spill)
Simplify k: (k, no-spill) (d, no-spill) (m, no-spill) (f, no-spill) (e, may-spill) (c, no-spill) (g, no-spill) (h, no-spill)
Simplify j-b: (j-b, no-spill) (k, no-spill) (d, no-spill) (m, no-spill) (f, no-spill) (e, may-spill) (c, no-spill) (g, no-spill) (h, no-spill)

Step 3: Select (K=3)
Registers: R1 R2 R3
Stack (top first): (j-b, no-spill) (k, no-spill) (d, no-spill) (m, no-spill) (f, no-spill) (e, may-spill) (c, no-spill) (g, no-spill) (h, no-spill)
[Figure sequence: the full interference graph (nodes j, h, k, f, g, d, e, b, c, m) is colored one stack entry per frame as the entries are popped.]
When (e, may-spill) is popped, in the sixth frame: this is when our optimism could have paid off.

Live Range Splitting
The basic coloring algorithm does not consider cases in which a variable can be allocated to a register for part of its live range.
Some compilers deal with this by splitting live ranges within the iteration structure of the coloring algorithm, i.e., by pretending to split a variable into two new variables, one of which might be profitably assigned to a register and one of which might not.

Length of Live Ranges
The interference graph does not contain information about where in the CFG variables interfere or what the length of a variable's live range is.
For example, if we had only a few available registers in the following intermediate-code example, the right choice would be to spill variable w because it has the longest live range:
x = w + 1
c = a - 2
y = x * 3
z = w + y
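The length of each live range in this example can be measured with a single forward scan. The sketch below makes two assumptions that the slide does not state: a variable that is used before any definition is live from the start of the block, and each variable is defined at most once. The length is counted in statements from the definition (or block entry) to the last use.

```python
# (defined variable, used variables) for the four statements on the slide.
CODE = [("x", ["w"]), ("c", ["a"]), ("y", ["x"]), ("z", ["w", "y"])]

def live_range_lengths(code):
    """Return {variable: number of statements between its start and its last use}."""
    start, end = {}, {}
    for index, (defined, used) in enumerate(code, start=1):
        for v in used:
            start.setdefault(v, 0)    # used before any definition: live on entry
            end[v] = index
        start.setdefault(defined, index)
    # A variable with no use inside the block gets length 0.
    return {v: end.get(v, s) - s for v, s in start.items()}

lengths = live_range_lengths(CODE)
longest = max(lengths, key=lengths.get)
print(lengths, longest)
```

Under these assumptions w spans all four statements, while x and a span two and y spans one, which is the information the interference graph alone does not expose.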

Effect of Instruction Reordering on Register Pressure
The coloring algorithm does not take into account the fact that reordering IL instructions can reduce interference. Consider the following example:
Original Ordering        Optimized Ordering
(needs 3 registers)      (needs 2 registers)
t1 := A[i]               t2 := A[j]
t2 := A[j]               t3 := A[k]
t3 := A[k]               t4 := t2 * t3
t4 := t2 * t3            t1 := A[i]
t5 := t1 + t4            t5 := t1 + t4
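The register counts in the table can be checked by computing the largest number of temporaries that are live at the same time in each ordering. The sketch counts only the temporaries t1 to t5 and ignores A, i, j, and k, which is an assumption about what the slide counts; it also assumes that t5 needs no register after the last instruction.

```python
# (defined temporary, used temporaries) per instruction.
ORIGINAL = [("t1", []), ("t2", []), ("t3", []),
            ("t4", ["t2", "t3"]), ("t5", ["t1", "t4"])]
OPTIMIZED = [("t2", []), ("t3", []), ("t4", ["t2", "t3"]),
             ("t1", []), ("t5", ["t1", "t4"])]

def registers_needed(code):
    """Peak number of simultaneously live temporaries, found by a backward scan."""
    live, peak = set(), 0
    for defined, used in reversed(code):
        live.discard(defined)
        live.update(used)
        peak = max(peak, len(live))
    return peak

original_pressure = registers_needed(ORIGINAL)
optimized_pressure = registers_needed(OPTIMIZED)
print(original_pressure, optimized_pressure)
```

Delaying the load of t1 until just before its only use keeps it from overlapping with t2 and t3.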

Brief History of Register Allocation
Chaitin (ACM SIGPLAN Notices, 1982): Use the simple heuristic for register allocation. Spill/no-spill decisions are made during the construction phase of the algorithm.
Briggs (PLDI 1989): Finds out that Chaitin's algorithm spills even when there are available registers. Solution: the optimistic approach: may-spill during construction, decide at spilling time.

Brief History of Register Allocation
Chow-Hennessy (SIGPLAN 1984, ASPLOS 1990): Priority-based coloring. Integrate spilling decisions in the coloring decisions: spill a variable for a limited life range. Favor dense over sparse use regions. Consider parameter passing convention.
Callahan (PLDI 1991): Hierarchical Coloring Graph, register preferencing, profitability of spilling.