Online Portfolio Selection: A Survey

BIN LI, Wuhan University
STEVEN C. H. HOI, Nanyang Technological University

Online portfolio selection is a fundamental problem in computational finance, which has been extensively studied across several research communities, including finance, statistics, artificial intelligence, machine learning, and data mining. This article aims to provide a comprehensive survey and a structural understanding of online portfolio selection techniques published in the literature. From an online machine learning perspective, we first formulate online portfolio selection as a sequential decision problem, and then we survey a variety of state-of-the-art approaches, which are grouped into several major categories, including benchmarks, Follow-the-Winner approaches, Follow-the-Loser approaches, Pattern-Matching based approaches, and Meta-Learning Algorithms. In addition to the problem formulation and related algorithms, we also discuss the relationship of these algorithms with the capital growth theory so as to better understand the similarities and differences of their underlying trading ideas. This article aims to provide a timely and comprehensive survey for both machine learning and data mining researchers in academia and quantitative portfolio managers in the financial industry to help them understand the state of the art and facilitate their research and practical applications. We also discuss some open issues and evaluate some emerging new trends for future research.

Categories and Subject Descriptors: J.1 [Computer Applications]: Administrative Data Processing, Financial; J.4 [Computer Applications]: Social and Behavioral Sciences, Economics; I.2.6 [Artificial Intelligence]: Learning

General Terms: Design, Algorithms, Economics

Additional Key Words and Phrases: Machine learning, optimization, portfolio selection

ACM Reference Format: Bin Li and Steven C. H. Hoi. 2014. Online portfolio selection: A survey. ACM Comput. Surv. 46, 3, Article 35 (January 2014), 36 pages. DOI: http://dx.doi.org/10.1145/2512962

This work is fully supported by Singapore MOE Academic tier-1 research grant (RG33/11). More information about our project of online portfolio selection is available at http://olps.stevenhoi.org/.

Authors' addresses: B. Li, Department of Finance, School of Economics and Management, Wuhan University, P. R. China; email: binli.whu@gmail.com; corresponding author: S. C. H. Hoi, School of Computer Engineering, Nanyang Technological University, Singapore; email: chhoi@ntu.edu.sg.

ACM Computing Surveys, Vol. 46, No. 3, Article 35, Publication date: January 2014.

1. INTRODUCTION

Portfolio selection, aiming to optimize the allocation of wealth across a set of assets, is a fundamental research problem in computational finance and a practical engineering task in financial engineering. There are two major schools for investigating this problem, that is, the mean-variance theory [Markowitz 1952, 1959; Markowitz et al. 2000], mainly from the finance community, and the Capital Growth Theory (CGT) [Kelly 1956; Hakansson and Ziemba 1995], primarily originated from information theory. The mean-variance theory, widely known in the asset management industry, focuses on a single-period (batch) portfolio selection to trade off a portfolio's expected return (mean) and risk (variance), which typically determines the optimal portfolios subject to the investor's risk-return profile. On the other hand, CGT focuses on multiple-period or sequential portfolio selection, aiming to maximize the portfolio's expected growth rate, or expected log return. Although both theories solve the task of portfolio selection, the latter is fitted to the online scenario, which naturally consists of multiple periods and is the focus of this article.

Online portfolio selection, which sequentially selects a portfolio over a set of assets in order to achieve certain targets, is a natural and important task for asset portfolio management. Aiming to maximize the cumulative wealth, several categories of algorithms have been proposed to solve this task. One category of algorithms, Follow the Winner, tries to asymptotically achieve the same growth rate (expected log return) as that of an optimal strategy, which is often based on the CGT. The second category, Follow the Loser, transfers the wealth from winning assets to losers, which seems contradictory to common sense but empirically often achieves significantly better performance. Finally, the third category, Pattern-Matching based approaches, tries to predict the next market distribution based on a sample of historical data and explicitly optimizes the portfolio based on the sampled distribution. Although these three categories are focused on a single strategy (class), there are also some other strategies that focus on combining multiple strategies (classes): Meta-Learning Algorithms (MLAs). As a brief summary, Table I outlines the list of main algorithms and corresponding references.

This article provides a comprehensive survey of online portfolio selection algorithms belonging to the mentioned categories. To the best of our knowledge, this is the first survey that includes these three categories and the MLAs as well. Moreover, we are the first to explicitly discuss the connection between the online portfolio selection algorithms and CGT, and illustrate their underlying trading ideas.
In the following sections, we also clarify the scope of this article and discuss some related existing surveys in the literature.

1.1. Scope

In this survey, we focus on discussing the empirical motivating ideas of the online portfolio selection algorithms, while only skimming theoretical aspects (e.g., competitive analysis by El-Yaniv [1998] and Borodin et al. [2000] and asymptotical convergence analysis by Györfi et al. [2012]). Moreover, various other related issues and topics are excluded from this survey, as discussed next.

First of all, it is important to mention that the Portfolio Selection task in our survey differs from a great body of financial engineering studies [Kimoto et al. 1993; Merhav and Feder 1998; Cao and Tay 2003; Lu et al. 2009; Dhar 2011; Huang et al. 2011], which attempted to forecast financial time series by applying machine learning techniques and conduct single stock trading [Katz and McCormick 2000; Koolen and Vovk 2012], such as reinforcement learning [Moody et al. 1998; Moody and Saffell 2001; O et al. 2002], neural networks [Kimoto et al. 1993; Dempster et al. 2001], genetic algorithms [Mahfoud and Mani 1996; Allen and Karjalainen 1999; Mandziuk and Jaruszewicz 2011], decision trees [Tsang et al. 2004], support vector machines [Tay and Cao 2002; Cao and Tay 2003; Lu et al. 2009], and boosting and expert weighting [Creamer 2007; Creamer and Freund 2007, 2010; Creamer 2012]. The key difference between these existing works and the subject area of this survey is that their learning goal is to make explicit predictions of future prices/trends and to trade on a single asset [Borodin et al. 2000, Section 6], whereas our goal is to directly optimize the allocation among a set of assets.

Second, this survey emphasizes the importance of online decision for portfolio selection, meaning that related market information arrives sequentially and the allocation decision must be made immediately. Due to the sequential (online) nature of this task, we mainly focus on the survey of multiperiod/sequential portfolio selection work, in which the portfolio is rebalanced to a specified allocation at the end of each trading period [Cover 1991], and the goal typically is to maximize the expected log return over a sequence of trading periods.

Table I. General Classification for the State-of-the-Art Online Portfolio Selection Algorithms

Classifications | Algorithms | Representative References
Benchmarks | Buy And Hold |
 | Best Stock |
 | Constant Rebalanced Portfolios | Kelly [1956]; Cover [1991]
Follow the Winner | Universal Portfolios | Cover [1991]; Cover and Ordentlich [1996]
 | Exponential Gradient | Helmbold et al. [1996, 1998]
 | Follow the Leader | Gaivoronski and Stella [2000]
 | Follow the Regularized Leader | Agarwal et al. [2006]
 | Aggregating-Type Algorithms | Vovk and Watkins [1998]
Follow the Loser | Anti-Correlation | Borodin et al. [2003, 2004]
 | Passive Aggressive Mean Reversion | Li et al. [2012]
 | Confidence Weighted Mean Reversion | Li et al. [2011b, 2013]
 | Online Moving Average Reversion | Li and Hoi [2012]
 | Robust Median Reversion | Huang et al. [2013]
Pattern-Matching Based Approaches | Nonparametric Histogram Log-Optimal Strategy | Györfi et al. [2006]
 | Nonparametric Kernel-Based Log-Optimal Strategy |
 | Nonparametric Nearest Neighbor Log-Optimal Strategy | Györfi et al. [2008]
 | Correlation-Driven Nonparametric Learning Strategy | Li et al. [2011a]
 | Nonparametric Kernel-Based Semi-Log-Optimal Strategy | Györfi et al. [2007]
 | Nonparametric Kernel-Based Markowitz-Type Strategy | Ottucsák and Vajda [2007]
 | Nonparametric Kernel-Based GV-Type Strategy | Györfi and Vajda [2008]
Meta-Learning Algorithms | Aggregating Algorithm | Vovk [1990, 1998]
 | Fast Universalization Algorithm | Akcoglu et al. [2002, 2004]
 | Online Gradient Updates | Das and Banerjee [2011]
 | Online Newton Updates |
 | Follow the Leading History | Hazan and Seshadhri [2009]
We note that these works can be connected to the CGT [Kelly 1956], which stems from the seminal paper of Kelly [1956] and was further developed by Breiman [1960, 1961], Hakansson [1970, 1971], Thorp [1969, 1971], Bell and Cover [1980], Finkelstein and Whitley [1981], Algoet and Cover [1988], Barron and Cover [1988], MacLean et al. [1992], MacLean and Ziemba [1999], Ziemba and Ziemba [2007], MacLean et al. [2010], and others. It has been successfully applied to gambling [Thorp 1962, 1969, 1997], sports betting [Hausch et al. 1981; Ziemba and Hausch 1984, 2008; Thorp 1997], and portfolio investment [Thorp and Kassouf 1967; Rotando and Thorp 1992; Ziemba 2005]. We thus exclude the studies related to the mean-variance portfolio theory [Markowitz 1952, 1959], which were typically developed for single-period (batch) portfolio selection (with the exception of some extensions [Li and Ng 2000; Dai et al. 2010]).

Finally, this article focuses on surveying the algorithmic aspects and providing a structural understanding of the existing online portfolio selection strategies. To prevent loss of focus, we will not dig into theoretical details. In the literature, there is a large body of related work for the theory [MacLean et al. 2011]. Interested researchers can explore the details of the theory from two exhaustive surveys [Thorp 1997; MacLean and Ziemba 2008], and its history from Poundstone [2005] and Györfi et al. [2012, Chapter 1].

1.2. Related Surveys

There exist several related surveys in this area, but none of them is comprehensive and timely enough for understanding the state of the art of online portfolio selection research. For example, El-Yaniv [1998, Section 5] and Borodin et al. [2000] surveyed the online portfolio selection problem in the framework of competitive analysis. Using our classification in Table I, Borodin et al. mainly surveyed the benchmarks and two Follow-the-Winner algorithms, that is, Universal Portfolios and Exponential Gradient (refer to the details in Section 3.2). Although the competitive framework is important for the Follow-the-Winner category, both surveys are outdated in the sense that they do not include a number of state-of-the-art algorithms proposed afterward. A recent survey by Györfi et al. [2012, Chapter 2] mainly surveyed Pattern-Matching based approaches (i.e., the third category shown in Table I); it does not include the other categories in this area and is thus far from complete.

1.3. Organization

The remainder of this article is organized as follows. Section 2 formulates the problem of online portfolio selection formally and addresses several practical issues. Section 3 introduces the state-of-the-art algorithms, including Benchmarks in Section 3.1, the Follow-the-Winner approaches in Section 3.2, Follow-the-Loser approaches in Section 3.3, Pattern-Matching based approaches in Section 3.4, and Meta-Learning algorithms in Section 3.5. Section 4 connects the existing algorithms with the CGT and also illustrates the essentials of their underlying trading ideas. Section 5 discusses several related open issues, and finally Section 6 concludes this survey and outlines some future directions.

2. PROBLEM SETTING

Consider a financial market with $m$ assets, in which we invest our wealth over all assets in the market for a sequence of $n$ trading periods.
The market price change is represented by an $m$-dimensional price relative vector $\mathbf{x}_t \in \mathbb{R}_+^m$, $t = 1, \ldots, n$, where the $i$th element of the $t$th price relative vector, $x_{t,i}$, denotes the ratio of the $t$th closing price to the last closing price for the $i$th asset. Thus, an investment in asset $i$ on period $t$ increases by a factor of $x_{t,i}$. We also denote the market price changes from period $t_1$ to $t_2$ ($t_2 > t_1$) by a market window, which consists of a sequence of price relative vectors $\mathbf{x}_{t_1}^{t_2} = \{\mathbf{x}_{t_1}, \ldots, \mathbf{x}_{t_2}\}$, where $t_1$ denotes the beginning period and $t_2$ denotes the ending period. One special market window starts from period 1 to $n$, that is, $\mathbf{x}_1^n = \{\mathbf{x}_1, \ldots, \mathbf{x}_n\}$.

At the beginning of the $t$th period, an investment is specified by a portfolio vector $\mathbf{b}_t$, $t = 1, \ldots, n$. The $i$th element of the $t$th portfolio, $b_{t,i}$, represents the proportion of capital invested in the $i$th asset. Typically, we assume a portfolio is self-financed, and no margin/short is allowed. Thus, a portfolio satisfies the constraint that each entry is nonnegative and all entries sum up to one, that is, $\mathbf{b}_t \in \Delta_m$, where $\Delta_m = \{\mathbf{b} : \mathbf{b} \succeq 0,\ \mathbf{b}^\top \mathbf{1} = 1\}$. Here, $\mathbf{1}$ is the $m$-dimensional vector of all 1s, and $\mathbf{b}^\top \mathbf{1}$ denotes the inner product of $\mathbf{b}$ and $\mathbf{1}$. The investment procedure from period 1 to $n$ is represented by a portfolio strategy, which is a sequence of mappings as follows:

$$\mathbf{b}_1 = \frac{1}{m}\mathbf{1}, \qquad \mathbf{b}_t : \mathbb{R}_+^{m(t-1)} \to \Delta_m, \quad t = 2, 3, \ldots, n,$$

where $\mathbf{b}_t = \mathbf{b}_t(\mathbf{x}_1^{t-1})$ denotes the portfolio computed from the past market window $\mathbf{x}_1^{t-1}$. Let us denote the portfolio strategy for $n$ periods as $\mathbf{b}_1^n = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$.

For the $t$th period, a portfolio manager apportions its capital according to portfolio $\mathbf{b}_t$ at the opening time and holds the portfolio until the closing time. Thus, the portfolio wealth will increase by a factor of $\mathbf{b}_t^\top \mathbf{x}_t = \sum_{i=1}^m b_{t,i} x_{t,i}$. Since this model uses price relatives and reinvests the capital, the portfolio wealth will increase multiplicatively. From period 1 to $n$, a portfolio strategy $\mathbf{b}_1^n$ increases the initial wealth $S_0$ by a factor of $\prod_{t=1}^n \mathbf{b}_t^\top \mathbf{x}_t$, that is, the final cumulative wealth after a sequence of $n$ periods is

$$S_n(\mathbf{b}_1^n) = S_0 \prod_{t=1}^n \mathbf{b}_t^\top \mathbf{x}_t = S_0 \prod_{t=1}^n \sum_{i=1}^m b_{t,i} x_{t,i}.$$

Since the model assumes multiperiod investment, we define the exponential growth rate for a strategy $\mathbf{b}_1^n$ as

$$W_n(\mathbf{b}_1^n) = \frac{1}{n} \log S_n(\mathbf{b}_1^n) = \frac{1}{n} \sum_{t=1}^n \log \mathbf{b}_t^\top \mathbf{x}_t,$$

where the second equality takes the initial wealth to be $S_0 = 1$.

Finally, let us combine all elements and formulate the online portfolio selection model. In a portfolio selection task, the decision maker is a portfolio manager, whose goal is to produce a portfolio strategy $\mathbf{b}_1^n$ in order to achieve certain targets. Following the principles conveyed by the algorithms in Table I, our target is to maximize the portfolio cumulative wealth $S_n$. The portfolio manager computes the portfolio strategy in a sequential fashion. At the beginning of period $t$, based on the previous market window $\mathbf{x}_1^{t-1}$, the portfolio manager learns a new portfolio vector $\mathbf{b}_t$ for the coming price relative vector $\mathbf{x}_t$, where the decision criterion varies among different managers/strategies. The portfolio $\mathbf{b}_t$ is scored using the portfolio period return $\mathbf{b}_t^\top \mathbf{x}_t$. This procedure is repeated until period $n$, and the strategy is finally scored according to the portfolio cumulative wealth $S_n$. Algorithm 1 shows the framework of online portfolio selection, which serves as a general procedure to backtest any online portfolio selection algorithm.

ALGORITHM 1: Online portfolio selection framework.
Input: $\mathbf{x}_1^n$: Historical market sequence
Output: $S_n$: Final cumulative wealth
Initialize $S_0 = 1$, $\mathbf{b}_1 = (\frac{1}{m}, \ldots, \frac{1}{m})$
for $t = 1, 2, \ldots, n$ do
    Portfolio manager computes a portfolio $\mathbf{b}_t$;
    Market reveals the market price relative $\mathbf{x}_t$;
    Portfolio incurs period return $\mathbf{b}_t^\top \mathbf{x}_t$ and updates cumulative return $S_t = S_{t-1} \times (\mathbf{b}_t^\top \mathbf{x}_t)$;
    Portfolio manager updates his/her online portfolio selection rules;
end

In general, some assumptions are made in the previous widely adopted model:
(1) Transaction cost. We assume no transaction costs/taxes in the model.
(2) Market liquidity. We assume that one can buy and sell any quantity of any asset at its closing prices.
(3) Impact cost. We assume market behavior is not affected by any portfolio selection strategy.
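The loop in Algorithm 1 can be sketched in a few lines of Python. This is only a minimal illustration under the three assumptions listed above, not the authors' implementation: the names `run_backtest` and `uniform_crp`, and the convention that a strategy is a function of the past market window, are hypothetical choices made for this sketch.

```python
from typing import Callable, List

Vector = List[float]
# Hypothetical convention: a strategy maps (past market window, m) to a portfolio.
Strategy = Callable[[List[Vector], int], Vector]


def run_backtest(price_relatives: List[Vector], strategy: Strategy,
                 initial_wealth: float = 1.0) -> float:
    """Return the final cumulative wealth S_n of `strategy` on the sequence."""
    wealth = initial_wealth
    history: List[Vector] = []           # past market window x_1^{t-1}
    for x_t in price_relatives:
        b_t = strategy(history, len(x_t))             # manager computes b_t
        period_return = sum(b * x for b, x in zip(b_t, x_t))  # b_t . x_t
        wealth *= period_return                       # S_t = S_{t-1} * (b_t . x_t)
        history.append(list(x_t))                     # market reveals x_t
    return wealth


def uniform_crp(history: List[Vector], m: int) -> Vector:
    """Illustrative strategy: always rebalance to the uniform portfolio."""
    return [1.0 / m] * m


# Toy two-asset market: cash plus an asset that doubles, then halves.
market = [[1.0, 2.0], [1.0, 0.5]]
final_wealth = run_backtest(market, uniform_crp)
print(final_wealth)  # 1.5 * 0.75 = 1.125
```

Passing the history into the strategy folds the "manager updates his/her rules" step into the next call; a stateful strategy object would be an equally reasonable design.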

To better understand these notions and the model presented, let us illustrate with a classical example.

Example 2.1 (Synthetic market by Cover and Gluss [1986]). Assume a two-asset market with cash and one volatile asset with the price relative sequence $\mathbf{x}_1^n = \{(1, 2), (1, \frac{1}{2}), (1, 2), \ldots\}$. The first price relative vector $\mathbf{x}_1 = (1, 2)$ means that if we invest \$1 in the first asset, we will get \$1 at the end of the period; if we invest \$1 in the second asset, we will get \$2 after the period. Let a fixed-proportion portfolio strategy be $\mathbf{b}_1^n = \{(\frac{1}{2}, \frac{1}{2}), (\frac{1}{2}, \frac{1}{2}), \ldots\}$, which means that every day, the manager redistributes the capital equally among the two assets. For the first period, the portfolio wealth increases by a factor of $1 \times \frac{1}{2} + 2 \times \frac{1}{2} = \frac{3}{2}$. Initializing the capital with $S_0 = 1$, the capital at the end of the first period equals $S_1 = S_0 \times \frac{3}{2} = \frac{3}{2}$. Similarly, $S_2 = S_1 \times (1 \times \frac{1}{2} + \frac{1}{2} \times \frac{1}{2}) = \frac{3}{2} \times \frac{3}{4} = \frac{9}{8}$. Thus, at the end of period $n$, the final cumulative wealth equals

$$S_n(\mathbf{b}_1^n) = \begin{cases} \left(\frac{9}{8}\right)^{n/2} & n \text{ is even} \\ \frac{3}{2} \left(\frac{9}{8}\right)^{(n-1)/2} & n \text{ is odd,} \end{cases}$$

and the exponential growth rate is

$$W_n(\mathbf{b}_1^n) = \begin{cases} \frac{1}{2} \log \frac{9}{8} & n \text{ is even} \\ \frac{n-1}{2n} \log \frac{9}{8} + \frac{1}{n} \log \frac{3}{2} & n \text{ is odd,} \end{cases}$$

which approaches $\frac{1}{2} \log \frac{9}{8} > 0$ if $n$ is sufficiently large.

2.1. Transaction Cost

In reality, the most important and unavoidable issue is transaction costs. In this section, we model the transaction costs into our formulation, which enables us to evaluate online portfolio selection algorithms. However, we will not introduce strategies [Davis and Norman 1990; Iyengar and Cover 2000; Akian et al. 2001; Schäfer 2002; Györfi and Vajda 2008; Ormos and Urbán 2011] that directly solve the transaction costs issues.

The widely adopted transaction costs model is the proportional transaction costs model [Blum and Kalai 1999; Györfi and Vajda 2008], in which the incurred transaction cost is proportional to the wealth transferred during rebalancing. Let the brokers charge transaction costs on both buying and selling. At the beginning of the $t$th period, the portfolio manager intends to rebalance the portfolio from the closing-price-adjusted portfolio $\hat{\mathbf{b}}_{t-1}$ to a new portfolio $\mathbf{b}_t$. Here, $\hat{\mathbf{b}}_{t-1}$ is calculated as

$$\hat{b}_{t-1,i} = \frac{b_{t-1,i}\, x_{t-1,i}}{\mathbf{b}_{t-1}^\top \mathbf{x}_{t-1}}, \quad i = 1, \ldots, m.$$
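As a small illustration of the closing-price-adjusted portfolio defined above, the sketch below computes it for one period. The function name `closing_adjusted_portfolio` is a hypothetical helper introduced here, and the numbers reuse the first period of the synthetic two-asset market from Example 2.1.

```python
from typing import List


def closing_adjusted_portfolio(b_prev: List[float],
                               x_prev: List[float]) -> List[float]:
    """Weights after prices move but before rebalancing.

    Each asset's weight is its share of the end-of-period wealth:
    b_hat_i = b_i * x_i / (b . x).
    """
    period_return = sum(b * x for b, x in zip(b_prev, x_prev))
    return [b * x / period_return for b, x in zip(b_prev, x_prev)]


# Uniform portfolio, first asset flat, second asset doubles.
b_hat = closing_adjusted_portfolio([0.5, 0.5], [1.0, 2.0])
print(b_hat)  # approximately [1/3, 2/3]
```

The adjusted weights still sum to one, so the distance between this vector and the next target portfolio measures how much wealth has to be traded when rebalancing.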
Assume two transaction cost rates $\gamma_b \in (0, 1)$ and $\gamma_s \in (0, 1)$, where $\gamma_b$ denotes the transaction costs rate incurred during buying and $\gamma_s$ denotes the transaction costs rate incurred during selling. After rebalancing, $S_{t-1}$ will be decomposed into two parts, that is, the net wealth $N_{t-1}$ in the new portfolio $\mathbf{b}_t$ and the transaction costs incurred during the buying and selling. If the wealth on asset $i$ before rebalancing is higher than that after rebalancing, that is, $\frac{b_{t-1,i} x_{t-1,i}}{\mathbf{b}_{t-1}^\top \mathbf{x}_{t-1}} S_{t-1} \geq b_{t,i} N_{t-1}$, then there will be a selling rebalancing. Otherwise, a buying rebalancing is required. Formally,

$$S_{t-1} = N_{t-1} + \gamma_s \sum_{i=1}^m \left( \frac{b_{t-1,i} x_{t-1,i}}{\mathbf{b}_{t-1}^\top \mathbf{x}_{t-1}} S_{t-1} - b_{t,i} N_{t-1} \right)^+ + \gamma_b \sum_{i=1}^m \left( b_{t,i} N_{t-1} - \frac{b_{t-1,i} x_{t-1,i}}{\mathbf{b}_{t-1}^\top \mathbf{x}_{t-1}} S_{t-1} \right)^+,$$

where $(a)^+ = \max(a, 0)$. Let us denote the transaction costs factor [Györfi and Vajda 2008] as the ratio of net wealth after rebalancing to wealth before rebalancing, that is, $c_{t-1} = \frac{N_{t-1}}{S_{t-1}} \in (0, 1)$. Dividing the previous equation by $S_{t-1}$, we get

$$1 = c_{t-1} + \gamma_s \sum_{i=1}^m \left( \frac{b_{t-1,i} x_{t-1,i}}{\mathbf{b}_{t-1}^\top \mathbf{x}_{t-1}} - b_{t,i} c_{t-1} \right)^+ + \gamma_b \sum_{i=1}^m \left( b_{t,i} c_{t-1} - \frac{b_{t-1,i} x_{t-1,i}}{\mathbf{b}_{t-1}^\top \mathbf{x}_{t-1}} \right)^+. \quad (1)$$

Clearly, given $\mathbf{b}_{t-1}$, $\mathbf{x}_{t-1}$, and $\mathbf{b}_t$, there exists a unique transaction costs factor for each rebalancing. Thus, we can denote $c_{t-1}$ as a function, $c_{t-1} = c(\mathbf{b}_t, \mathbf{b}_{t-1}, \mathbf{x}_{t-1})$. Moreover, considering that the portfolio is in the simplex domain, the factor satisfies $\frac{1 - \gamma_s}{1 + \gamma_b} \leq c_{t-1} \leq 1$. Finally, for each period, the wealth grows by a factor as $S_t = S_{t-1} \times c_{t-1} \times (\mathbf{b}_t^\top \mathbf{x}_t)$, and the final cumulative wealth after $n$ periods equals

$$S_n = S_0 \prod_{t=1}^n c_{t-1} \times (\mathbf{b}_t^\top \mathbf{x}_t),$$

where $c_{t-1}$ is calculated as in Equation (1).

3. ONLINE PORTFOLIO SELECTION APPROACHES

In this section, we survey the area of online portfolio selection. Algorithms in this area formulate the online portfolio selection task as in Section 2 and derive explicit portfolio update schemes for each period. Basically, the routine is to implicitly assume various price relative predictions and learn optimal portfolios.

In the subsequent sections, we mainly list the algorithms following Table I. In particular, we first introduce several benchmark algorithms in Section 3.1. Then, we introduce the algorithms with explicit update schemes in the subsequent three sections. We classify them based on the direction of the weight transfer. The first approach, Follow the Winner, tries to increase the relative weights of more successful experts/stocks, often based on their historical performance. On the contrary, the second approach, Follow the Loser, tries to increase the relative weights of less successful experts/stocks, or transfer the weights from winners to losers. The third approach, Pattern Matching, tries to build a portfolio based on some sampled similar historical patterns with no explicit weights transfer directions. After that, we survey MLAs, which can be applied to higher-level experts equipped with any existing algorithm.

3.1. Benchmarks

3.1.1. Buy-and-Hold Strategy. The most common baseline is the Buy-and-Hold (BAH) strategy, in which one invests wealth among a pool of assets with an initial portfolio $\mathbf{b}_1$ and holds the portfolio until the end.
The manager only buys the assets at the beginning of the first period and does not rebalance in the following periods, and the portfolio holdings are implicitly changed following the market fluctuations. For example, at the end of the first period, the portfolio holding becomes $\frac{\mathbf{b}_1 \odot \mathbf{x}_1}{\mathbf{b}_1^\top \mathbf{x}_1}$, where $\odot$ denotes an elementwise product. In summary, the final cumulative wealth achieved by a BAH strategy is the initial-portfolio-weighted average of the individual stocks' final wealth,

$$S_n(\mathrm{BAH}(\mathbf{b}_1)) = \mathbf{b}_1^\top \left( \bigodot_{t=1}^n \mathbf{x}_t \right).$$

The BAH strategy with initial uniform portfolio $\mathbf{b}_1 = (\frac{1}{m}, \ldots, \frac{1}{m})$ is referred to as the uniform BAH strategy, which is often adopted as a market strategy to produce a market index.

3.1.2. Best Stock Strategy. Another widely adopted benchmark is the Best Stock (Best) strategy, which is a special BAH strategy that puts all capital on the stock with the best performance in hindsight. Clearly, its initial portfolio $\mathbf{b}^\circ$ in hindsight can be calculated as

$$\mathbf{b}^\circ = \arg\max_{\mathbf{b} \in \Delta_m} \mathbf{b}^\top \left( \bigodot_{t=1}^n \mathbf{x}_t \right).$$

As a result, the final cumulative wealth achieved by the Best strategy can be calculated as

$$S_n(\mathrm{Best}) = \max_{\mathbf{b} \in \Delta_m} \mathbf{b}^\top \left( \bigodot_{t=1}^n \mathbf{x}_t \right) = S_n(\mathrm{BAH}(\mathbf{b}^\circ)).$$

3.1.3. Constant Rebalanced Portfolios. Another more sophisticated benchmark strategy is the Constant Rebalanced Portfolio (CRP) strategy, which rebalances the portfolio to a fixed portfolio $\mathbf{b}$ every period. In particular, the portfolio strategy can be represented as $\mathbf{b}_1^n = \{\mathbf{b}, \mathbf{b}, \ldots\}$. Thus, the cumulative portfolio wealth achieved by a CRP strategy after $n$ periods is defined as

$$S_n(\mathrm{CRP}(\mathbf{b})) = \prod_{t=1}^n \mathbf{b}^\top \mathbf{x}_t.$$

One special CRP strategy that rebalances to the uniform portfolio $\mathbf{b} = (\frac{1}{m}, \ldots, \frac{1}{m})$ each period is the Uniform Constant Rebalanced Portfolio (UCRP). It is possible to calculate an optimal offline portfolio for the CRP strategy as

$$\mathbf{b}^\star = \arg\max_{\mathbf{b} \in \Delta_m} \log S_n(\mathrm{CRP}(\mathbf{b})) = \arg\max_{\mathbf{b} \in \Delta_m} \sum_{t=1}^n \log(\mathbf{b}^\top \mathbf{x}_t),$$

which is a convex optimization problem and can be efficiently solved. The CRP strategy with $\mathbf{b}^\star$ is denoted by Best Constant Rebalanced Portfolio (BCRP). BCRP achieves a final cumulative portfolio wealth and corresponding exponential growth rate defined as follows:

$$S_n(\mathrm{BCRP}) = \max_{\mathbf{b} \in \Delta_m} S_n(\mathrm{CRP}(\mathbf{b})) = S_n(\mathrm{CRP}(\mathbf{b}^\star)),$$

$$W_n(\mathrm{BCRP}) = \frac{1}{n} \log S_n(\mathrm{BCRP}) = \frac{1}{n} \log S_n(\mathrm{CRP}(\mathbf{b}^\star)).$$

Note that the BCRP strategy is a hindsight strategy, which can only be calculated with complete market sequences. Cover [1991] proved the benefits of BCRP as a target, that is, BCRP exceeds the best stock, the Value Line Index (geometric mean of component returns), and the Dow Jones Index (arithmetic mean of component returns, or BAH). Moreover, BCRP is invariant under permutations of the price relative sequences, that is, it does not depend on the order in which $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ occur.

Now let us compare the BAH and CRP strategies by continuing Example 2.1.

Example 3.1 (Synthetic Market by Cover and Gluss [1986]).
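The BAH wealth formula can be checked with a short sketch. The helper name `bah_final_wealth` and the optional `initial_wealth` argument are assumptions made for this illustration; the formula in the text corresponds to an initial wealth of 1.

```python
from typing import List


def bah_final_wealth(b1: List[float], price_relatives: List[List[float]],
                     initial_wealth: float = 1.0) -> float:
    """Final wealth of Buy-and-Hold: b_1 . (x_1 * x_2 * ... * x_n), elementwise."""
    # Cumulative price relative of each asset over all periods.
    cumulative = [1.0] * len(b1)
    for x_t in price_relatives:
        cumulative = [c * x for c, x in zip(cumulative, x_t)]
    # Initial-portfolio-weighted average of the per-asset final wealth.
    return initial_wealth * sum(b * c for b, c in zip(b1, cumulative))


# Three periods of a toy market: cash, and an asset that doubles, halves, doubles.
market = [[1.0, 2.0], [1.0, 0.5], [1.0, 2.0]]
wealth = bah_final_wealth([0.5, 0.5], market)
print(wealth)  # 0.5 * 1 + 0.5 * 2 = 1.5
```

Because nothing is traded after the first period, only the per-asset cumulative products matter; the order of the price relatives does not change the result.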
Assume a two-asset market with cash and one volatile asset with the price relative sequence $\mathbf{x}_1^n = \{(1, 2), (1, \frac{1}{2}), (1, 2), \ldots\}$. Let us consider the BAH with uniform initial portfolio $\mathbf{b}_1 = (\frac{1}{2}, \frac{1}{2})$ and the CRP with uniform portfolio $\mathbf{b} = (\frac{1}{2}, \frac{1}{2})$. Clearly, since no asset grows in the long run, the final wealth of BAH equals the uniform weighted summation of the two assets, which roughly equals 1 in the long run. On the other hand, according to the analysis of Example 2.1, the final cumulative wealth of CRP is roughly $\left(\frac{9}{8}\right)^{\frac{n}{2}}$, which increases
exponentially. Note that the BAH only rebalances in the 1st period, whereas the CRP rebalances every period. While the market provides no return on the synthetic market, CRP can produce an exponentially increasing return. The underlying idea of CRP is to take advantage of the underlying volatility, or so-called volatility pumping [Luenberger 1998, Chapter 15].

Since CRP rebalances to a fixed portfolio each period, its frequent transactions will incur high transaction costs. Helmbold et al. [1996, 1998] proposed a Semi-Constant Rebalanced Portfolio (Semi-CRP), which rebalances the portfolio on selected periods rather than every period.

One desired theoretical result for online portfolio selection is universality [Cover 1991]. An online portfolio selection algorithm Alg is universal if its average (external) regret [Stoltz and Lugosi 2005; Blum and Mansour 2007] for $n$ periods asymptotically approaches 0,

$$\frac{1}{n} \mathrm{regret}_n(Alg) = W_n(BCRP) - W_n(Alg) \longrightarrow 0, \quad \text{as } n \to \infty. \qquad (2)$$

In other words, a universal portfolio selection algorithm asymptotically approaches the same exponential growth rate as the BCRP strategy for arbitrary sequences of price relatives.

3.2. Follow-the-Winner Approaches

Follow the Winner is characterized by increasing the relative weights of more successful experts/stocks. Rather than targeting the market and the best stock, algorithms in this category often aim to track the BCRP strategy, which can be shown to be the optimal strategy in an i.i.d. market [Cover and Thomas 1991, Theorem 15.3.1]. In other words, such optimality motivates universal portfolio selection algorithms to approach the performance of the hindsight BCRP for arbitrary sequences of price relative vectors, called individual sequences.

3.2.1. Universal Portfolios. The basic idea of Universal Portfolio type algorithms is to assign the capital to a single class of base experts, let the experts run, and finally pool their wealth. Strategies of this type are analogous to the BAH strategy. The difference is that the base BAH expert is the strategy investing in a single stock, and thus the number of experts is the same as that of stocks.
In other words, the BAH strategy buys the individual stocks, lets the stocks go, and finally pools their individual wealth. On the other hand, the base expert in the Follow-the-Winner category can be any strategy class that invests in any set of stocks in the market. Besides, algorithms in this category are also similar to the MLAs further described in Section 3.5, although MLA generally applies to experts of multiple classes.

Cover [1991] proposed the Universal Portfolio (UP) strategy, and Cover and Ordentlich [1996] further refined the algorithm as the μ-weighted Universal Portfolio, in which μ denotes a given distribution on the space of valid portfolios $\Delta_m$. Intuitively, Cover's UP operates similarly to a Fund of Funds (FOF), and its main idea is to BAH the parameterized CRP strategies over the whole simplex domain. In particular, it initially invests a proportion of wealth $d\mu(\mathbf{b})$ to each portfolio manager operating a CRP strategy with $\mathbf{b} \in \Delta_m$, and lets the CRP managers run. Then, at the end, each manager will grow his wealth to $S_n(\mathbf{b})\,d\mu(\mathbf{b})$. Finally, Cover's UP pools the individual experts' wealth over the continuum of portfolio strategies. Note that $S_n(\mathbf{b}) = e^{n W_n(\mathbf{b})}$, which means that the portfolio grows at an exponential rate of $W_n(\mathbf{b})$.

Formally, its update scheme [Cover and Ordentlich 1996, Definition 1] can be interpreted as a historical performance weighted average of all valid CRPs,

$$\mathbf{b}_{t+1} = \frac{\int_{\Delta_m} \mathbf{b}\, S_t(\mathbf{b})\, d\mu(\mathbf{b})}{\int_{\Delta_m} S_t(\mathbf{b})\, d\mu(\mathbf{b})}.$$

Note that at the beginning of period $t+1$, one CRP manager's wealth (historical performance) equals $S_t(\mathbf{b})\,d\mu(\mathbf{b})$. Incorporating the initial wealth of $S_0 = 1$, the final cumulative wealth is the weighted average of the CRP managers' wealth [Cover and Ordentlich 1996, Eq. (24)],

$$S_n(UP) = \int_{\Delta_m} S_n(\mathbf{b})\, d\mu(\mathbf{b}). \qquad (3)$$

One special case is that μ follows a uniform distribution; then the portfolio update reduces to Cover's UP [Cover 1991, Eq. (1.3)]. Another special case is the Dirichlet$(\frac{1}{2}, \ldots, \frac{1}{2})$ weighted UP [Cover and Ordentlich 1996], which is proved to be a more optimal allocation. Alternatively, if a loss function is defined as the negative logarithmic function of portfolio return, Cover's UP is actually an exponentially weighted average forecaster [Cesa-Bianchi and Lugosi 2006].

Cover [1991] showed that with suitable smoothness conditions the average of exponentials grows at the same exponential rate as the maximum, thus one can asymptotically approach BCRP's exponential growth rate. The regret achieved by Cover's UP is $O(m \log n)$, and its time complexity is $O(n^m)$, where $m$ denotes the number of stocks and $n$ refers to the number of periods. Cover and Ordentlich [1996] proved that the Dirichlet$(\frac{1}{2}, \ldots, \frac{1}{2})$ weighted Universal Portfolio has the same scale of regret bound but a better constant term [Cover and Ordentlich 1996, Theorem 2].

As Cover's UP is based on an ideal market model, one research topic with respect to Cover's UP is to extend the algorithm with various realistic assumptions. Cover and Ordentlich [1996] extended the model to include side information, which can be instantiated as experts' opinions, fundamental data, and so forth. Blum and Kalai [1999] took account of transaction costs for online portfolio selection and proposed a universal portfolio algorithm to handle the costs.

Another research topic is to generalize Cover's UP with different underlying base expert classes, rather than the CRP strategy.
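
The integrals in the UP update are rarely available in closed form. The following is a minimal sketch, not Cover's original procedure and not the rapidly mixing sampler of Kalai and Vempala: it assumes a uniform prior μ and replaces the integral by an average over a finite number of CRP experts sampled uniformly from the simplex. The names `sample_simplex`, `universal_portfolio`, `num_experts`, and `seed` are illustrative.

```python
import random

def sample_simplex(m, rng):
    """Draw one portfolio uniformly from the simplex, i.e., Dirichlet(1, ..., 1)."""
    e = [rng.expovariate(1.0) for _ in range(m)]
    s = sum(e)
    return [v / s for v in e]

def universal_portfolio(price_relatives, num_experts=2000, seed=0):
    """Approximate the uniform-prior UP by averaging over sampled CRP experts.

    Returns (portfolios, wealth, pooled): the portfolio used in each period,
    the wealth obtained by trading those portfolios, and the average final
    wealth of the sampled CRP experts (the sampled analogue of Eq. (3)).
    """
    rng = random.Random(seed)
    m = len(price_relatives[0])
    experts = [sample_simplex(m, rng) for _ in range(num_experts)]
    expert_wealth = [1.0] * num_experts  # S_0 = 1 for every CRP manager
    portfolios, wealth = [], 1.0
    for x in price_relatives:
        total = sum(expert_wealth)
        # Next portfolio: wealth-weighted average of the experts' portfolios.
        b = [sum(w * e[i] for w, e in zip(expert_wealth, experts)) / total
             for i in range(m)]
        portfolios.append(b)
        wealth *= sum(bi * xi for bi, xi in zip(b, x))
        # Each CRP manager rebalances to its own fixed portfolio and grows.
        expert_wealth = [w * sum(ei * xi for ei, xi in zip(e, x))
                         for w, e in zip(expert_wealth, experts)]
    return portfolios, wealth, sum(expert_wealth) / num_experts

# Assumed toy data: cash plus an asset that alternately doubles and halves.
market = [(1.0, 2.0), (1.0, 0.5)] * 10
portfolios, wealth, pooled = universal_portfolio(market)
print(wealth, pooled)
```

In this sampled setting the wealth obtained by trading the averaged portfolio coincides with the pooled expert wealth up to rounding, which mirrors the Fund of Funds interpretation. The cost grows with the number of sampled experts, so the sketch is only practical for a small number of assets.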
Jamshidian [1992] generalized the algorithm for the continuous time market and derived the long-term performance of Cover's UP in this setting. Vovk and Watkins [1998] applied the Aggregating Algorithm (AA) [Vovk 1990] to a finite number of arbitrary investment strategies. Cover's UP becomes a specialized case of AA when applied to an infinite number of CRPs. We will further investigate AA in Section 3.2.5. Ordentlich and Cover [1998] derived the lower bound on the ratio of the final wealth achieved by any nonanticipating investment strategy to that of the BCRP strategy. Cross and Barron [2003] generalized Cover's UP from the CRP strategy class to any parameterized target class and proposed a universal strategy that runs in polynomial time. Akcoglu et al. [2002, 2004] extended Cover's UP from the parameterized CRP class to a wide class of investment strategies, including trading strategies operating on a single stock and portfolio strategies operating on the whole stock market. Kozat and Singer [2011] proposed a similar universal algorithm based on the class of Semi-CRPs [Helmbold et al. 1996, 1998], which provides good performance with transaction costs.

Beyond the previous analysis, various studies have also discussed the connection between Cover's UP and universal prediction [Feder et al. 1992], data compression [Rissanen 1983], and Markowitz's mean-variance theory [Markowitz 1952, 1959]. Algoet [1992] discussed the universal schemes for prediction, gambling, and portfolio selection. Cover [1996] and Ordentlich [1996] discussed the connection
of universal portfolio selection and data compression. Belentepe [2005] presented a statistical view of Cover's UP strategy and connected it with the traditional Markowitz mean-variance portfolio theory [Markowitz 1952]. The author showed that by allowing short selling and leverage, UP is approximately equivalent to sequential mean-variance optimization; otherwise, the strategy is approximately equivalent to constrained sequential optimization. Although its update scheme is distribution free, UP implicitly estimates the multivariate mean and covariance matrix.

Cover's UP has a good theoretical performance guarantee; however, its implementation costs exponential time in the number of assets, which restricts its practical capability. To overcome this computational bottleneck, Kalai and Vempala [2002] presented an efficient implementation based on nonuniform random walks that are rapidly mixing. Their implementation requires a polynomial running time of $O(m^7 n^8)$, which is a substantial improvement over the original bound of $O(n^m)$.

3.2.2. Exponential Gradient. The strategies of the Exponential Gradient type generally focus on the following optimization problem:

$$\mathbf{b}_{t+1} = \arg\max_{\mathbf{b} \in \Delta_m} \; \eta \log(\mathbf{b}^\top \mathbf{x}_t) - R(\mathbf{b}, \mathbf{b}_t), \qquad (4)$$

where $R(\mathbf{b}, \mathbf{b}_t)$ denotes a regularization term and $\eta > 0$ is the learning rate. One straightforward interpretation of the optimization is to track the stock with the best performance in the last period but keep the new portfolio close to the previous portfolio. This is obtained using the regularization term $R(\mathbf{b}, \mathbf{b}_t)$.

Helmbold et al. [1996, 1998] proposed the Exponential Gradient (EG) strategy, which is based on the algorithm proposed for the mixture estimation problem [Helmbold et al. 1997]. The EG strategy employs the relative entropy as the regularization term in Equation (4),

$$R(\mathbf{b}, \mathbf{b}_t) = \sum_{i=1}^{m} b_i \log \frac{b_i}{b_{t,i}}.$$

EG's formulation is thus convex in $\mathbf{b}$; however, it is hard to solve since log is nonlinear. Thus, the authors adopted log's first-order Taylor expansion at $\mathbf{b}_t$,

$$\log(\mathbf{b}^\top \mathbf{x}_t) \approx \log(\mathbf{b}_t^\top \mathbf{x}_t) + \frac{\mathbf{x}_t^\top (\mathbf{b} - \mathbf{b}_t)}{\mathbf{b}_t^\top \mathbf{x}_t},$$

with which the first term in Equation (4) becomes linear and easy to solve.
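
The step from the linearized objective to a closed-form solution can be made explicit. The following is a sketch of the standard Lagrangian argument, under the assumption that only the sum-to-one constraint needs a multiplier; the multiplier $\lambda$ and the function $\mathcal{L}$ are notation introduced here for illustration.

```latex
\begin{align*}
\mathcal{L}(\mathbf{b}, \lambda)
  &= \eta \left( \log(\mathbf{b}_t^\top \mathbf{x}_t)
     + \frac{\mathbf{x}_t^\top (\mathbf{b} - \mathbf{b}_t)}{\mathbf{b}_t^\top \mathbf{x}_t} \right)
     - \sum_{i=1}^{m} b_i \log \frac{b_i}{b_{t,i}}
     + \lambda \left( \sum_{i=1}^{m} b_i - 1 \right), \\
\frac{\partial \mathcal{L}}{\partial b_i}
  &= \eta \, \frac{x_{t,i}}{\mathbf{b}_t^\top \mathbf{x}_t}
     - \log \frac{b_i}{b_{t,i}} - 1 + \lambda = 0
  \;\Longrightarrow\;
  b_i = b_{t,i} \exp\!\left( \eta \, \frac{x_{t,i}}{\mathbf{b}_t^\top \mathbf{x}_t} \right) e^{\lambda - 1}.
\end{align*}
```

Choosing $\lambda$ so that the weights sum to 1 turns $e^{\lambda - 1}$ into a normalization constant, and nonnegativity holds automatically because each factor is positive.
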
Solving the optimization, the update rule [Helmbold et al. 1998, Equation (3.3)] becomes

$$b_{t+1,i} = b_{t,i} \exp\left( \eta \, \frac{x_{t,i}}{\mathbf{b}_t^\top \mathbf{x}_t} \right) \Big/ Z, \quad i = 1, \ldots, m,$$

where $Z$ denotes the normalization term such that the portfolio sums to 1.

The optimization problem in Equation (4) can also be solved using the Gradient Projection (GP) and Expectation Maximization (EM) methods [Helmbold et al. 1997]. GP and EM adopt different regularization terms. In particular, GP adopts L2-norm regularization, and EM adopts $\chi^2$ regularization:

$$R(\mathbf{b}, \mathbf{b}_t) = \begin{cases} \frac{1}{2} \sum_{i=1}^{m} (b_i - b_{t,i})^2 & \text{GP} \\ \frac{1}{2} \sum_{i=1}^{m} \frac{(b_i - b_{t,i})^2}{b_{t,i}} & \text{EM.} \end{cases}$$

The final update rule of GP [Helmbold et al. 1997, Eq. (5)] is

$$b_{t+1,i} = b_{t,i} + \eta \left( \frac{x_{t,i}}{\mathbf{b}_t^\top \mathbf{x}_t} - \frac{1}{m} \sum_{i=1}^{m} \frac{x_{t,i}}{\mathbf{b}_t^\top \mathbf{x}_t} \right),$$
and the update rule of EM [Helmbold et al. 1997, Eq. (7)] is

$$b_{t+1,i} = b_{t,i} \left( \eta \left( \frac{x_{t,i}}{\mathbf{b}_t^\top \mathbf{x}_t} - 1 \right) + 1 \right),$$

which can also be viewed as the first-order approximation of EG's update formula.

The regret of the EG strategy can be bounded by $O(\sqrt{n \log m})$ with $O(m)$ running time per period. The regret is not as tight as that of Cover's UP; however, its linear running time substantially surpasses that of Cover's UP. Besides, the authors also proposed a variant, which has a regret bound of $O(m^{0.5} (\log m)^{0.25} n^{0.75})$. Although not proposed for the online portfolio selection task, according to Helmbold et al. [1997], GP can straightforwardly achieve a regret of $O(\sqrt{mn})$, which is significantly worse than that of EG.

One key parameter for EG is the learning rate $\eta > 0$. In order to achieve the desired regret bound presented previously, $\eta$ has to be small. However, as $\eta \to 0$, its update approaches the uniform portfolio, and EG reduces to UCRP.

Das and Banerjee [2011] extended the EG algorithm in the sense of the MLA, named Online Gradient Updates (OGU), which will be introduced in Section 3.5.3. OGU combines underlying experts such that the overall system can achieve a performance that is no worse than any convex combination of base experts. To handle the case of nonzero transaction costs, Das et al. [2013] extended the GP algorithm by appending an L1 regularization to the formulation, and proposed the Online Lazy Update (OLU) algorithm.

3.2.3. Follow the Leader. Strategies in the Follow-the-Leader (FTL) approach try to track the BCRP strategy until time $t$, that is,

$$\mathbf{b}_{t+1} = \mathbf{b}_t^\star = \arg\max_{\mathbf{b} \in \Delta_m} \sum_{j=1}^{t} \log(\mathbf{b}^\top \mathbf{x}_j). \qquad (5)$$

Clearly, this category follows the BCRP leader, and the ultimate leader is the BCRP over all periods.

Ordentlich [1996, Chapter 4.4] briefly mentioned a strategy to obtain portfolios by mixing the BCRP up to time $t$ and the uniform portfolio,

$$\mathbf{b}_{t+1} = \frac{t}{t+1} \mathbf{b}_t^\star + \frac{1}{t+1} \frac{1}{m} \mathbf{1}.$$

He also showed its worst case bound, which is slightly worse than that of Cover's UP.

Gaivoronski and Stella [2000] proposed Successive Constant Rebalanced Portfolios (SCRP) and Weighted Successive Constant Rebalanced Portfolios (WSCRP) for stationary markets.
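
Equation (5) does not prescribe a solver for the BCRP up to time $t$. The following is a minimal sketch that swaps in a multiplicative fixed-point iteration in the style of Cover's algorithm for log-optimal portfolios, chosen here only because it needs no external library; it is not the stochastic optimization procedure used for SCRP. The names `bcrp`, `follow_the_leader`, and `iterations` are illustrative, and strictly positive price relatives are assumed.

```python
def bcrp(price_relatives, iterations=500):
    """Approximate the BCRP of a price-relative sequence.

    Uses the multiplicative fixed-point iteration
        b_i <- b_i * mean_t( x_{t,i} / (b . x_t) ),
    which keeps b on the simplex (assumes positive price relatives).
    """
    n, m = len(price_relatives), len(price_relatives[0])
    b = [1.0 / m] * m
    for _ in range(iterations):
        returns = [sum(bi * xi for bi, xi in zip(b, x)) for x in price_relatives]
        b = [b[i] * sum(x[i] / r for x, r in zip(price_relatives, returns)) / n
             for i in range(m)]
    return b

def follow_the_leader(price_relatives):
    """Return b_1 (uniform) followed by b_{t+1} = BCRP of x_1..x_t, as in Eq. (5)."""
    m = len(price_relatives[0])
    portfolios = [[1.0 / m] * m]
    for t in range(1, len(price_relatives)):
        portfolios.append(bcrp(price_relatives[:t]))
    return portfolios

# Assumed toy data: cash plus an asset that alternately doubles and halves.
for b in follow_the_leader([(1.0, 2.0), (1.0, 0.5), (1.0, 2.0)]):
    print(b)
```

On this toy sequence the leader after one period is the asset that just doubled, and after two periods it returns to the uniform portfolio, which illustrates how strongly FTL can swing when little history is available. Recomputing the leader from scratch every period is wasteful; a practical implementation would warm-start from the previous solution.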
For each period, SCRP directly adopts the BCRP portfolio until now, that is,

$$\mathbf{b}_{t+1} = \mathbf{b}_t^\star.$$

The authors further solved the optimal portfolio $\mathbf{b}_t^\star$ via stochastic optimization [Birge and Louveaux 1997], resulting in the detailed updates of SCRP [Gaivoronski and Stella 2000, Algorithm 1]. On the other hand, WSCRP outputs a convex combination of the SCRP portfolio and the last portfolio,

$$\mathbf{b}_{t+1} = (1 - \gamma)\, \mathbf{b}_t^\star + \gamma\, \mathbf{b}_t,$$

where $\gamma \in [0, 1]$ represents the trade-off parameter.

The regret bounds achieved by SCRP [Gaivoronski and Stella 2000, Theorem 1] and WSCRP [Gaivoronski and Stella 2000, Theorem 4] are both $O(K^2 \log n)$, where $K$ is a
uniform upper bound of the gradient of $\log(\mathbf{b}^\top \mathbf{x})$ with respect to $\mathbf{b}$. It is straightforward to see that given the same assumption of upper/lower bound of price relatives as Cover's UP [Cover 1991, Theorem 6.1], the regret bound is on the same scale as that of Cover's UP, although the constant term is slightly worse.

Rather than assuming that the historical market is stationary, some algorithms assume that the historical market is nonstationary. Gaivoronski and Stella [2000] proposed Variable Rebalanced Portfolios (VRP), which calculates the BCRP portfolio based on the latest sliding window. To be more specific, VRP updates its portfolio as follows:

$$\mathbf{b}_{t+1} = \arg\max_{\mathbf{b} \in \Delta_m} \sum_{j=t-W+1}^{t} \log(\mathbf{b}^\top \mathbf{x}_j),$$

where $W$ denotes a specified window size. Following their algorithms for CRP, they further proposed Successive Variable Rebalanced Portfolios (SVRP) and Weighted Successive Variable Rebalanced Portfolios (WSVRP). No theoretical results were given for the two algorithms.

Gaivoronski and Stella [2003] further generalized Gaivoronski and Stella [2000] and proposed Adaptive Portfolio Selection (APS) for the online portfolio selection task. By changing the objective part, APS can handle three types of portfolio selection task, that is, adaptive Markowitz portfolio, log-optimal CRP, and index tracking. To handle the transaction cost issue, they proposed Threshold Portfolio Selection (TPS), which only rebalances the portfolio if the expected return of the new portfolio exceeds that of the previous portfolio by more than a threshold.

3.2.4. Follow the Regularized Leader. Another category of approaches follows a similar idea to FTL while adding a regularization term, and thus becomes the Follow the Regularized Leader (FTRL) approach. In general, FTRL approaches can be formulated as follows:

$$\mathbf{b}_{t+1} = \arg\max_{\mathbf{b} \in \Delta_m} \sum_{\tau=1}^{t} \log(\mathbf{b}^\top \mathbf{x}_\tau) - \frac{\beta}{2} R(\mathbf{b}), \qquad (6)$$

where $\beta$ denotes the trade-off parameter and $R(\mathbf{b})$ is a regularization term on $\mathbf{b}$. Note that here all historical information is captured in the first term, thus the regularization term only concerns the next portfolio, which is different from the EG algorithm.
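One typical regularization is an L2-norm, that is, $R(\mathbf{b}) = \|\mathbf{b}\|^2$. Agarwal et al. [2006] proposed the Online Newton Step (ONS) by solving the optimization problem in Equation (6) with L2-norm regularization via online convex optimization techniques [Zinkevich 2003; Hazan et al. 2006, 2007; Hazan 2006]. Similar to the Newton method for offline optimization, the basic idea is to replace the log term via its second-order Taylor expansion at $\mathbf{b}_t$, then solve the problem for a closed-form update scheme. Finally, the ONS update rule [Agarwal et al. 2006, Lemma 2] is

$$\mathbf{b}_1 = \left( \frac{1}{m}, \ldots, \frac{1}{m} \right), \qquad \mathbf{b}_{t+1} = \Pi_{\Delta_m}^{\mathbf{A}_t} \left( \delta \mathbf{A}_t^{-1} \mathbf{p}_t \right),$$

with

$$\mathbf{A}_t = \sum_{\tau=1}^{t} \left( \frac{\mathbf{x}_\tau \mathbf{x}_\tau^\top}{(\mathbf{b}_\tau^\top \mathbf{x}_\tau)^2} \right) + \mathbf{I}_m, \qquad \mathbf{p}_t = \left( 1 + \frac{1}{\beta} \right) \sum_{\tau=1}^{t} \frac{\mathbf{x}_\tau}{\mathbf{b}_\tau^\top \mathbf{x}_\tau},$$

where $\beta$ is the trade-off parameter, $\delta$ is a scale term, and $\Pi_{\Delta_m}^{\mathbf{A}_t}(\cdot)$ is an exact projection to the simplex domain.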

ONS iteratively updates the first- and second-order information and the portfolio with a time cost of $O(m^3)$, which is independent of the number of historical instances. The authors also proved ONS's regret bound [Agarwal et al. 2006, Theorem 1] of $O(m^{1.5} \log(mn))$, which is worse than those of Cover's UP and the Dirichlet(1/2) weighted UP.

While FTRL, or even the Follow-the-Winner category, mainly focuses on worst-case investing, Hazan and Kale [2009, 2012] linked the worst-case model with the widely used average-case investing model, that is, the Geometric Brownian Motion (GBM) model [Bachelier 1900; Osborne 1959; Cootner 1964], which is a probabilistic model of stock returns. The authors also designed an investment strategy that is universal in the worst case and is capable of exploiting the GBM model. Their algorithm, the so-called Exp-Concave-FTL, follows a slightly different form of optimization problem (6) with L2-norm regularization,

$$\mathbf{b}_{t+1} = \arg\max_{\mathbf{b} \in \Delta_m} \sum_{\tau=1}^{t} \log(\mathbf{b}^\top \mathbf{x}_\tau) - \frac{1}{2} \|\mathbf{b}\|^2.$$

Similar to ONS, the optimization problem can be efficiently solved via online convex optimization techniques. The authors further analyzed its regret bound and linked it with the GBM model. Linking the GBM model, the regret bound [Hazan and Kale 2012, Theorem 1.1 and Corollary 1.2] is $O(m \log(Q + m))$, where $Q$ denotes the quadratic variability, calculated as $n - 1$ times the sample variance of the sequence of price relative vectors. Since $Q$ is typically much smaller than $n$, the regret bound significantly improves the $O(m \log n)$ bound.

Besides the improved regret bound, the authors also discussed the relationship of their algorithm's performance to trading frequency. The authors asserted that increasing the trading frequency would decrease the variance of the minimum variance CRP, that is, the more frequently they trade, the more likely the payoff will be close to the expected value. On the other hand, the regret stays the same even if they trade more. Consequently, the performance of such an algorithm is expected to improve as the trading frequency increases [Agarwal et al. 2006].
Das and Banerjee [2011] further extended the FTRL approach to a generalized MLA, that is, Online Newton Update (ONU), which guarantees that the overall performance is no worse than any convex combination of its underlying experts.

3.2.5. Aggregating-Type Algorithms. Although BCRP is the optimal strategy for an i.i.d. market, the i.i.d. assumption is controversial in real markets, so the optimal portfolio may not belong to CRP or fixed fraction portfolios. Some algorithms have been designed to track a different set of experts. The algorithms in this category share a similar idea with the MLAs in Section 3.5. However, here the base experts are of a special class, that is, individual experts that invest fully in a single stock, although in general MLAs often apply to more complex experts from multiple classes.

Vovk and Watkins [1998] applied the AA [Vovk 1990, 1997, 1999, 2001] to the online portfolio selection task, of which Cover's UP is a special case. The general setting for AA is to define a countable or finite set of base experts and sequentially allocate the resource among multiple base experts in order to achieve a good performance that is no worse than any fixed combination of underlying experts. Although its general form is shown in Section 3.5.1, its portfolio update formula [Vovk and Watkins 1998, Algorithm 1] for online portfolio selection is

$$\mathbf{b}_{t+1} = \frac{\int_{\Delta_m} \mathbf{b} \prod_{i=1}^{t} (\mathbf{b}^\top \mathbf{x}_i)^{\eta} \, P_0(d\mathbf{b})}{\int_{\Delta_m} \prod_{i=1}^{t} (\mathbf{b}^\top \mathbf{x}_i)^{\eta} \, P_0(d\mathbf{b})},$$
where $P_0(d\mathbf{b})$ denotes the prior weights of the experts. As a special case, Cover's UP corresponds to AA with uniform prior distribution and $\eta = 1$.

Singer [1997] proposed Switching Portfolios (SP) to track a changing market, in which the stock's behaviors may change frequently. Rather than the CRP class, SP decides a set of basic strategies (for example, the pure strategy that invests all wealth in one asset) and chooses a prior distribution over the set of strategies. Based on the actual return of each strategy and the prior distribution, SP is able to select a portfolio for each period. Upon this procedure, the author proposed two algorithms, both of which assume that the duration of using a basic strategy follows a geometric distribution with a parameter of $\gamma$, which can be fixed or varied in time. With fixed $\gamma$, the first version of SP has an explicit update formula [Singer 1997, Eq. (6)],

$$\mathbf{b}_{t+1} = \left( 1 - \gamma - \frac{\gamma}{m-1} \right) \mathbf{b}_t + \frac{\gamma}{m-1}.$$

With a varying $\gamma$, SP has no explicit update. The author also adapted the algorithm to handle transaction costs. Theoretically, the author further gave the lower bound of SP's logarithmic wealth with respect to any underlying switching regime in hindsight [Singer 1997, Theorem 2]. Empirical evaluation on Cover's two-stock pairs shows that SP can outperform UP, EG, and BCRP in most cases.

Levina and Shafer [2008] proposed the Gaussian Random Walk (GRW) strategy, which switches among the base experts according to a Gaussian distribution. Kozat and Singer [2007] extended SP to piecewise fixed fraction strategies, which partition the periods into different segments and transit among these segments. The authors proved the piecewise universality of their algorithm, which can achieve the performance of the optimal piecewise fixed fraction strategy. Kozat and Singer [2008] extended Kozat and Singer [2007] to the case of transaction costs. Kozat and Singer [2009, 2010] further generalized Kozat and Singer [2007] to the sequential decision problem. Kozat et al. [2008] proposed another piecewise universal portfolio selection strategy via context trees, and Kozat et al.
[2011] generalized it to the sequential decision problem via tree weighting.

Most interestingly, switching portfolios adopt the notion of regime switching [Hamilton 1994, 2008], which is different from the underlying assumption of universal portfolio selection methods and seems to be more plausible than an i.i.d. market. Regime switching is also applied to some state-of-the-art trading strategies [Hardy 2001; Mlnařík et al. 2009]. However, this approach suffers from its distribution assumption, because geometric and Gaussian distributions do not seem to fit the market well [Mandelbrot 1963; Cont 2001]. This leads to other potential distributions that can better model the markets.

3.3. Follow-the-Loser Approaches

The underlying assumption for the optimality of the BCRP strategy is that the market is i.i.d., which however does not always hold for real-world data and thus often results in inferior empirical performance, as observed in various previous works. Instead of tracking the winners, the Follow-the-Loser approach is often characterized by transferring the wealth from winners to losers. The underlying assumption of this approach is mean reversion [Bondt and Thaler 1985; Poterba and Summers 1988; Lo and MacKinlay 1990], which means that the good (poor)-performing assets will perform poorly (well) in the following periods.

To better understand the mean reversion principle, let us further analyze the behaviors of CRP in Example 3.1 [Li et al. 2012].

Example 3.2 (Synthetic Market by Cover and Gluss [1986]). As illustrated in Example 3.1, uniform CRP grows exponentially on the synthetic market. Now we
analyze its portfolio update behaviors, which follow mean reversion, as shown in Table II.

Table II. Example to Illustrate the Mean Reversion Trading Idea

#Period   Price Relative (A,B)   CRP          CRP Return   Portfolio Holdings   Notes
1         (1, 2)                 (1/2, 1/2)   3/2          (1/3, 2/3)           B → A
2         (1, 1/2)               (1/2, 1/2)   3/4          (2/3, 1/3)           A → B
3         (1, 2)                 (1/2, 1/2)   3/2          (1/3, 2/3)           B → A
...       ...                    ...          ...          ...                  ...

Suppose that the initial CRP portfolio is $(\frac{1}{2}, \frac{1}{2})$ and that at the end of the 1st period, the closing price adjusted portfolio holding becomes $(\frac{1}{3}, \frac{2}{3})$ and the corresponding cumulative wealth increases by a factor of $\frac{3}{2}$. At the beginning of the 2nd period, the CRP manager rebalances the portfolio to the initial uniform portfolio by transferring the wealth from the good-performing stock (B) to the poor-performing stock (A), which actually follows the mean reversion principle. Then, its cumulative wealth changes by a factor of $\frac{3}{4}$ and the portfolio holding at the end of the 2nd period becomes $(\frac{2}{3}, \frac{1}{3})$. At the beginning of the 3rd period, the wealth transfer with the mean reversion idea continues. In summary, CRP implicitly assumes that if one stock performs poorly (well), it tends to perform well (poorly) in the subsequent period and thus transfers the weights from good-performing stocks to poor-performing stocks.

3.3.1. Anti-Correlation. Borodin et al. [2003, 2004] proposed a Follow-the-Loser portfolio strategy named the Anti-Correlation (Anticor) strategy. Unlike Cover's UP, which makes no distributional assumption, the Anticor strategy assumes that the market follows the mean reversion principle. To exploit the mean reversion property, it statistically makes a bet on the consistency of positive lagged cross-correlation and negative autocorrelation.

To obtain a portfolio for the $(t+1)$st period, the Anticor algorithm adopts logarithmic price relatives [Hull 2008] in two specific market windows, that is, $\mathbf{y}_1 = \log(\mathbf{x}_{t-2w+1}^{t-w})$ and $\mathbf{y}_2 = \log(\mathbf{x}_{t-w+1}^{t})$.
It then calculates the cross-correlation matrix between $\mathbf{y}_1$ and $\mathbf{y}_2$:

$$M_{cov}(i, j) = \frac{1}{w-1} \left( \mathbf{y}_{1,i} - \bar{y}_1 \right)^\top \left( \mathbf{y}_{2,j} - \bar{y}_2 \right),$$

$$M_{cor}(i, j) = \begin{cases} \frac{M_{cov}(i, j)}{\sigma_1(i)\, \sigma_2(j)} & \sigma_1(i), \sigma_2(j) \neq 0 \\ 0 & \text{otherwise.} \end{cases}$$

Then, according to the cross-correlation matrix, the Anticor algorithm transfers the wealth according to the mean reversion trading idea, that is, it moves the proportions from the stocks that increased more to the stocks that increased less, and the corresponding amounts are adjusted according to the cross-correlation matrix. In particular, if asset $i$ increases more than asset $j$ and their sequences in the window are positively correlated, Anticor claims a transfer from asset $i$ to $j$ with the amount equal to the cross-correlation value ($M_{cor}(i, j)$) minus their negative autocorrelation values ($\min\{0, M_{cor}(i, i)\}$ and $\min\{0, M_{cor}(j, j)\}$). These transfer claims are finally normalized to keep the portfolio in the simplex domain.

Because of its mean reversion nature, it is difficult to obtain a useful bound such as the universal regret bound. Although it is heuristic and has no theoretical guarantee, Anticor empirically outperformed all other strategies at the time. On the other hand, its heuristic nature cannot fully exploit the mean reversion property. Thus, exploiting the property using systematic learning algorithms is highly desired.
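
The two matrices can be computed directly from the two windows. The following is a minimal sketch covering only the cross-correlation step, not the full Anticor transfer rule. It assumes that each window is given as a list of $w$ rows of log price relatives with one column per asset, that $w \geq 2$, and that the means and standard deviations are taken per column; the name `cross_correlation` is illustrative.

```python
from math import sqrt

def cross_correlation(y1, y2):
    """Return (M_cov, M_cor) between the columns of two w-by-m windows."""
    w, m = len(y1), len(y1[0])
    mean1 = [sum(row[i] for row in y1) / w for i in range(m)]
    mean2 = [sum(row[j] for row in y2) / w for j in range(m)]
    # Sample standard deviation of each column in each window.
    sd1 = [sqrt(sum((row[i] - mean1[i]) ** 2 for row in y1) / (w - 1))
           for i in range(m)]
    sd2 = [sqrt(sum((row[j] - mean2[j]) ** 2 for row in y2) / (w - 1))
           for j in range(m)]
    m_cov = [[sum((y1[k][i] - mean1[i]) * (y2[k][j] - mean2[j])
                  for k in range(w)) / (w - 1)
              for j in range(m)] for i in range(m)]
    # Correlation is defined as 0 whenever either column has zero deviation.
    m_cor = [[m_cov[i][j] / (sd1[i] * sd2[j]) if sd1[i] > 0 and sd2[j] > 0 else 0.0
              for j in range(m)] for i in range(m)]
    return m_cov, m_cor

# Assumed toy windows (w = 3, m = 2); the second column of y1 is constant.
y1 = [[0.0, 0.0], [0.25, 0.0], [0.5, 0.0]]
y2 = [[0.75, 0.0], [0.5, 0.5], [0.25, 1.0]]
m_cov, m_cor = cross_correlation(y1, y2)
print(m_cor)
```

Entry $(i, j)$ relates asset $i$ in the earlier window to asset $j$ in the later window, so the diagonal of `m_cor` plays the role of the lagged autocorrelation used in the transfer claims.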

3.3.2. Passive Aggressive Mean Reversion. Li et al. [2012] proposed the Passive Aggressive Mean Reversion (PAMR) strategy, which exploits the mean reversion property with Passive Aggressive (PA) online learning [Shalev-Shwartz et al. 2003; Crammer et al. 2006]. The main idea of PAMR is to design a loss function in order to reflect the mean reversion property, that is, if the expected return based on the last price relative is larger than a threshold, the loss will linearly increase; otherwise, the loss is zero. In particular, the authors defined the $\epsilon$-insensitive loss function for the $t$th period as

$$\ell_\epsilon(\mathbf{b}; \mathbf{x}_t) = \begin{cases} 0 & \mathbf{b}^\top \mathbf{x}_t \leq \epsilon \\ \mathbf{b}^\top \mathbf{x}_t - \epsilon & \text{otherwise,} \end{cases}$$

where $0 \leq \epsilon \leq 1$ is a sensitivity parameter to control the mean reversion threshold. Based on the loss function, PAMR passively maintains the last portfolio if the loss is zero; otherwise, it aggressively approaches a new portfolio that can force the loss to zero. In summary, PAMR obtains the next portfolio via the following optimization problem:

$$\mathbf{b}_{t+1} = \arg\min_{\mathbf{b} \in \Delta_m} \frac{1}{2} \|\mathbf{b} - \mathbf{b}_t\|^2 \quad \text{s.t.} \quad \ell_\epsilon(\mathbf{b}; \mathbf{x}_t) = 0. \qquad (7)$$

Solving the optimization problem (7), PAMR has a clean closed-form update scheme [Li et al. 2012, Proposition 1]:

$$\mathbf{b}_{t+1} = \mathbf{b}_t - \tau_t (\mathbf{x}_t - \bar{x}_t \mathbf{1}), \qquad \tau_t = \max\left\{ 0, \frac{\mathbf{b}_t^\top \mathbf{x}_t - \epsilon}{\|\mathbf{x}_t - \bar{x}_t \mathbf{1}\|^2} \right\}.$$

Since the authors ignored the nonnegativity constraint of the portfolio in the derivation, they also added a simplex projection step [Duchi et al. 2008]. The closed-form update scheme clearly reflects the mean reversion trading idea by transferring the wealth from the good-performing stocks to the poor-performing stocks. It also coincides with the general form [Lo and MacKinlay 1990, Eq. (1)] of return-based contrarian strategies [Conrad and Kaul 1998; Lo 2008], except for an adaptive multiplier $\tau_t$. Besides the optimization problem (7), the authors also proposed two variants to avoid noisy price relatives by introducing some nonnegative slack variables into the optimization, which is similar to soft margin support vector machines.

Similar to the Anticor algorithm, due to PAMR's mean reversion nature, it is hard to obtain a meaningful theoretical regret bound.
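
The closed-form update followed by a simplex projection is short enough to write out. The following is a minimal sketch of the basic PAMR step only (without the slack-variable variants), using a sort-based Euclidean projection onto the simplex; the names `project_to_simplex` and `pamr_update` are illustrative, and setting the step size to zero when all price relatives are equal is an assumption made here to avoid division by zero.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        candidate = (cumulative - 1.0) / j
        if uj - candidate > 0:
            theta = candidate  # keep the threshold of the largest valid j
    return [max(vi - theta, 0.0) for vi in v]

def pamr_update(b, x, epsilon):
    """One PAMR step: passive if b.x <= epsilon, aggressive otherwise."""
    m = len(b)
    portfolio_return = sum(bi * xi for bi, xi in zip(b, x))
    x_bar = sum(x) / m
    deviation = [xi - x_bar for xi in x]
    norm_sq = sum(d * d for d in deviation)
    loss = max(0.0, portfolio_return - epsilon)
    # Assumption: no update when all price relatives are equal.
    tau = loss / norm_sq if norm_sq > 0 else 0.0
    return project_to_simplex([bi - tau * d for bi, d in zip(b, deviation)])

# Asset B just doubled, so wealth is moved toward the poorer-performing asset A.
print(pamr_update([0.5, 0.5], [1.0, 2.0], 1.25))
print(pamr_update([0.5, 0.5], [1.0, 2.0], 0.5))
```

With a threshold of 1.25 the step lands on a portfolio whose return on the last price relative equals the threshold exactly, while the smaller threshold of 0.5 pushes the raw update outside the simplex and the projection moves all wealth to asset A.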
Nevertheless, PAMR achieves significant performance, beating all algorithms at the time, and shows its robustness with respect to its parameters. It also enjoys linear update time and runs extremely fast in the back tests, which shows its practicability for large-scale real-world applications. The underlying idea is to exploit the single-period mean reversion, which is empirically verified by its evaluations on several real market datasets. However, PAMR suffers from drawbacks in risk management since it suffers significant performance degradation if the underlying single-period mean reversion fails to exist. Such a drawback is clearly indicated by its performance on the DJIA dataset [Borodin et al. 2003, 2004; Li et al. 2012].

3.3.3. Confidence Weighted Mean Reversion. Li et al. [2011b] proposed the Confidence Weighted Mean Reversion (CWMR) algorithm to further exploit the second-order portfolio information, which refers to the variance of portfolio weights (not price or price relative), following the mean reversion trading idea via Confidence Weighted (CW) online learning [Dredze et al. 2008; Crammer et al. 2008, 2009; Dredze et al. 2010]. The basic idea of CWMR is to model the portfolio vector as a multivariate Gaussian distribution with mean $\boldsymbol{\mu} \in \mathbb{R}^m$ and the diagonal covariance matrix $\boldsymbol{\Sigma} \in \mathbb{R}^{m \times m}$, which