Electronic Supplementary Material: Goals as Reference Points in Marathon Running: A Novel Test of Reference Dependence

Alex Markle (The Wharton School, University of Pennsylvania, 3620 Locust Walk, Philadelphia, PA 19104; amarkle1@wharton.upenn.edu)
George Wu (University of Chicago, Booth School of Business, 5807 S. Woodlawn Avenue, Chicago, IL 60637; wu@chicagobooth.edu)
Rebecca White (University of Chicago, Booth School of Business, 5807 S. Woodlawn Avenue, Chicago, IL 60637; rebecca.white@chicagobooth.edu)
Aaron Sackett (University of St. Thomas, Opus College of Business, Mail MCH 316, 2115 Summit Ave., St. Paul, MN 55105; sackett@stthomas.edu)

This document provides additional detail and analyses for "Goals as Reference Points in Marathon Running: A Novel Test of Reference Dependence" (2017), Journal of Risk and Uncertainty 55(3).

A.1 Additional Details on Methodology

We solicited participants through marathon organizers, organized running groups, marathon training programs, message boards, and athletic shops. The study was advertised as a study on the relationship between marathon performance and satisfaction. Over the course of three years (2007-2009), we recruited prospective marathon runners for our targeted marathons, specifically Boston, 2008; Chicago, 2007-2009; Grandma's, 2008; Los Angeles, 2008; Marine Corps, 2007-2009; New York City, 2007; Portland (OR), 2007; Rock 'n' Roll (San Diego), 2008; and Twin Cities, 2007-2009. [1] Each of these marathons was one of the 15 largest U.S. marathons of its year, ranging from 6,875 finishers for the 2008 Grandma's Marathon to 38,557 finishers for the 2007 New York City Marathon. [2] The number of total finishers and the number of survey participants for each marathon are given in Table A.1.

Participants registered online for our study by providing their name, email address, and the marathon that they were planning to run. They were then randomly assigned to one of 6 conditions in a 3 (pre-marathon survey: goal-not-asked vs. early goal-asked vs. late goal-asked) x 2 (post-marathon survey: early vs. late) design.

Participants were entered into a random drawing for prizes. The prizes included one grand prize (worth approximately $300 USD, such as a GPS watch, an iPod, or a Bose SoundDock), three second prizes (worth approximately $100 USD, such as a running jacket, an iPod Shuffle, or a heart rate monitor watch), and 10 third prizes (worth approximately $10 USD, such as running socks, running gloves, or a winter hat). A separate drawing was conducted for each marathon.

The number of participants in each condition is presented in Table A.2. Tables A.3 and A.4 contain the wording and order of all pre- and post-marathon survey questions.

[1] We also recruited marathoners from the 2007 Honolulu, 2007 Philadelphia, and 2008 San Francisco Marathons. Because these three marathons yielded only 65 participants in total (26 of whom provided usable data), we dropped these participants from our analyses. Including these participants does not meaningfully change the results presented in the main text. For example, Ĵ in Table 1 of the main text changes from 0.967 (χ²(1) = 24.34, p < .001) to 0.966 (χ²(1) = 24.60, p < .001).

[2] Marathon statistics were obtained from http://marathonguide.com.

| Marathon | All finishers: total finishers | All finishers: average finishing time (min) | All finishers: female | Study sample: starting participants | Study sample: complete participants | Study sample: average finishing time (min) | Study sample: female |
|---|---|---|---|---|---|---|---|
| Chicago 2007 | 25,523 | 293.41 | 39.9% | 304 | 168 | 294.87 | 63.1% |
| Marine Corps 2007 | 20,625 | 281.20 | 39.0% | 276 | 203 | 283.15 | 53.7% |
| New York City 2007 | 38,623 | 268.79 | 32.5% | 51 | 40 | 241.30 | 37.5% |
| Portland 2007 | 7,738 | 301.02 | 54.3% | 36 | 23 | 294.03 | 60.9% |
| Twin Cities 2007 | 7,154 | 283.24 | 39.2% | 268 | 161 | 278.93 | 57.8% |
| Boston 2008 | 21,963 | 231.68 | 40.7% | 54 | 44 | 221.22 | 40.9% |
| Grandma's 2008 | 6,876 | 269.99 | 36.8% | 27 | 22 | 243.80 | 45.5% |
| Los Angeles 2008 | 17,246 | 339.97 | 38.4% | 86 | 67 | 325.34 | 71.6% |
| Rock 'n' Roll (San Diego) 2008 | 16,760 | 301.35 | 51.5% | 123 | 87 | 294.33 | 71.3% |
| Chicago 2008 | 31,344 | 286.53 | 43.6% | 241 | 134 | 280.68 | 70.1% |
| Marine Corps 2008 | 18,237 | 280.56 | 39.2% | 182 | 102 | 267.32 | 31.4% |
| Twin Cities 2008 | 7,979 | 257.43 | 39.9% | 173 | 107 | 253.22 | 52.3% |
| Chicago 2009 | 33,703 | 267.93 | 43.4% | 276 | 218 | 262.62 | 64.7% |
| Marine Corps 2009 | 21,405 | 281.23 | 39.6% | 440 | 332 | 275.87 | 51.5% |
| Twin Cities 2009 | 8,475 | 255.88 | 42.0% | 115 | 93 | 253.78 | 67.7% |
| Overall | 283,651 | 279.41 | 40.6% | 2,652 | 1,801 | 274.99 | 57.3% |

Table A.1: Basic statistics for our study sample for each of the 15 targeted marathons, as well as for all finishers in these marathons. These numbers are taken from official results posted by the marathon organizers and sometimes differ from the numbers reported on some online compiled lists.

All participants (n = 2,652)

| Pre-marathon condition | Post-marathon condition: Early | Post-marathon condition: Late |
|---|---|---|
| Goal-not-asked | 754 | 319 |
| Early goal-asked | 245 | 286 |
| Late goal-asked | 722 | 326 |

Usable sample (n = 1,801)

| Pre-marathon condition | Post-marathon condition: Early | Post-marathon condition: Late |
|---|---|---|
| Goal-not-asked | 522 | 224 |
| Early goal-asked | 155 | 176 |
| Late goal-asked | 506 | 218 |

Table A.2: Number of participants across pre- and post-marathon conditions. The usable sample drops participants who did not complete the entire study, who did not start or finish the marathon they entered, whose data could not be matched to official marathon results, or who had participated as a subject in an earlier marathon in our study.

Pre-Marathon Survey

[The original table also has two Condition columns, Goal-asked and Goal-not-asked; their entries are not preserved in this transcription.]

| Survey item | Response format |
|---|---|
| Please provide us with the following background information: | |
| Last Name | text entry |
| First Name (as you registered for the marathon) | text entry |
| Age | number entry |
| E-mail | text entry |
| Confirm E-mail | text entry |
| City | text entry |
| State | text entry |
| Country | text entry |
| Gender | male/female |
| Which marathon are you running? | drop-down menu |
| Are you registered for that marathon? | yes/no |
| Please respond to the following questions about your goals for the marathon that you will be running: | |
| Why are you running this marathon? | text entry (large box) |
| How important are the following objectives for you in running this marathon? | 1-7 Likert ("very unimportant" to "very important") [item order randomized for each participant] |
| - Running with friends or with a club | |
| - Improving on your personal best time | |
| - Placing well relative to other runners | |
| - Seeing a new city or new neighborhoods | |
| - Finishing the marathon | |
| - Meeting the time goal you have set for yourself | |
| - Raising money for a charity | |
| - Participating in a major event with thousands of other runners | |
| - Winning a cash prize | |
| - Accomplishing a feat respected by others | |
| - Demonstrating your athletic abilities to yourself | |
| - Increasing self-confidence | |
| - Having fun on marathon day | |
| Do you have any other objectives you hope to achieve that are not listed above? | text entry (medium box) |
| If you indicated another objective above, how important is that objective to you? | 1-7 Likert ("very unimportant" to "very important") |
| Do you have a specific time goal for the marathon? | yes/no |
| Please respond to the following questions about your time goals for the marathon: [this section appeared only for participants who answered yes to the question above] | |
| My time goal for the marathon is: | hours:minutes:seconds |
| My time goal for the half-marathon is: | hours:minutes:seconds |

Table A.3: Complete wording and order of the survey questions for the Pre-marathon survey

Pre-Marathon Survey (continued)

| Survey item | Response format |
|---|---|
| How likely is it that you will make your time goal? | slider scale from 0% to 100% |
| What is your strategy for meeting your time goal? | text entry (large box) |
| How satisfied will you be with each of the outcomes below? | 1-7 Likert ("very unimportant" to "very important") |
| - Finishing the marathon 20 minutes slower than your stated goal | |
| - Finishing the marathon 1 minute slower than your stated goal | |
| - Finishing the marathon 1 minute faster than your stated goal | |
| - Finishing the marathon 10 minutes faster than your stated goal | |
| - Finishing the marathon 20 minutes faster than your stated goal | |
| Please provide us with any other comments you may have about your marathon goals. | text entry (large box) |
| Please respond to the following questions about your training for the marathon that you will be running: | |
| On average, how many days per week did you run in the last month? | forced choice (1 through 7) |
| On average, how many miles per week did you run during the last month? | number entry |
| What is your longest run (in miles) in the last month? | number entry |
| How much effort have you put into training for the upcoming marathon? | 1-7 Likert ("very little effort" to "very large effort") |
| In general, how satisfied are you with your training? | 1-7 Likert ("very unsatisfied" to "very satisfied") |
| Are you a member of a running club? | yes/no (Which one?) |
| Are you enrolled in a marathon training program | yes/no (Which one?) |
| Please respond to the following questions about your past marathon experience: | |
| For how many years have you been running? | number entry |
| How many miles per week do you typically run when not training for a marathon? | number entry |
| My best Half-Marathon time is: | hours:minutes:seconds |
| My best 10-Kilometer time is: | hours:minutes:seconds |
| On average, how many competitive races do you enter per year? | number entry |
| Would you characterize yourself more as a serious runner, or as a casual runner? | 1-7 Likert ("very casual" to "very serious") |
| Have you completed a marathon before? | yes/no |
| Please respond to the following questions about your past marathon experience: [this section appeared only for participants who answered yes to the question above] | |
| How many marathons have you completed before? | number entry |
| How many marathons have you completed in the last 12 months? | number entry |
| In what year was your fastest marathon time? | number entry |
| In what year was your most recent marathon? | number entry |
| My fastest Marathon time was: | hours:minutes:seconds |
| My most recent Marathon time was: | hours:minutes:seconds |
| If you have any friends who may be interested in participating in our study, please enter their email addresses here. | |
| Friend 1 | text entry |
| Friend 2 | text entry |
| Friend 3 | text entry |

Table A.3: (continued) Complete wording and order of the survey questions for the Pre-marathon survey

Post-Marathon Survey

[The original table also has two Condition columns, Goal-asked and Goal-not-asked; their entries are not preserved in this transcription.]

| Survey item | Response format |
|---|---|
| Please provide us with the following background information: | |
| Last Name | text entry |
| First Name (as you registered for the marathon) | text entry |
| Age | number entry |
| E-mail | text entry |
| Confirm E-mail | text entry |
| City | text entry |
| State | text entry |
| Country | text entry |
| Which marathon did you run? | drop-down menu |
| Please respond to the following questions about your performance in the Marathon you ran: | |
| Which of the following is true for this marathon? | forced choice (select only one) |
| - I did not run in the Marathon this year. | |
| - I started but did not finish the Marathon this year. | |
| - I finished the Marathon this year. | |
| If you did not run the Marathon this year, please explain the reason for your decision. | text entry (large box) |
| If you started, but did not finish the marathon: | |
| - How many miles did you complete? | number entry |
| - What was the primary reason that you did not finish? | text entry (large box) |
| How satisfied are you with your performance in the Marathon? | 1-7 Likert ("very unsatisfied" to "very satisfied") |
| How satisfied are you with your race day effort? | 1-7 Likert ("very unsatisfied" to "very satisfied") |
| How important were each of the following objectives in contributing to you overall satisfaction with the Marathon run? | 1-7 Likert ("very unimportant" to "very important") [item order randomized for each participant] |
| - Running with friends or with a club | |
| - Improving on your personal best time | |
| - Placing well relative to other runners | |
| - Seeing a new city or new neighborhoods | |
| - Finishing the marathon | |
| - Meeting the time goal you have set for yourself | |
| - Raising money for a charity | |
| - Participating in a major event with thousands of other runners | |
| - Winning a cash prize | |
| - Accomplishing a feat respected by others | |
| - Demonstrating your athletic abilities to yourself | |
| - Increasing self-confidence | |
| - Having fun on marathon day | |
| Did you have any other objectives that contributed to your overall satisfaction with the Marathon run? | text entry (medium box) |
| If you indicated another objective above, how important was this objective to your overall satisfaction? | 1-7 Likert ("very unimportant" to "very important") |
| Did you have a specific time goal for the marathon? | yes/no |

Table A.4: Complete wording and order of the survey questions for the Post-marathon survey

Post-Marathon Survey (continued)

| Survey item | Response format |
|---|---|
| Please respond to the following questions about your time goals for the marathon: [this section appeared only for participants who answered yes to the question above] | |
| My time goal for the Marathon was: | hours:minutes:seconds |
| Prior to the race, how likely did you think it was that you would meet this time goal? | slider scale from 0% to 100% |
| One month before race day, what clock time did you think you would actually obtain? My expected clock time was: | hours:minutes:seconds |
| Now that the marathon is over, how realistic was the time goal you had set for yourself? | 1-7 Likert ("very unrealistic" to "very realistic") |
| If you were to run this Marathon again, would you: | forced choice (select only one) |
| - Raise your time goal | |
| - Lower your time goal | |
| - Leave your time goal the same | |
| If you beat your time goal, why do you think that was? (check all that apply) | [item order randomized for each participant] |
| - Time goal too low | check box |
| - Good weather | check box |
| - Good racing strategy | check box |
| - Trained well | check box |
| - Had a good day | check box |
| - Other | check box (text entry) |
| If you fell short of your time goal, why do you think that was? (check all that apply) | [item order randomized for each participant] |
| - Time goal too high | check box |
| - Too many runners | check box |
| - Bad weather | check box |
| - Poor racing strategy | check box |
| - Did not train enough | check box |
| - Simply had an off day | check box |
| - Physical ailment | check box |
| - Other | check box (text entry) |
| If you plan on running another marathon, what did you learn from this marathon that you will apply in your next marathon? | text entry (large box) |
| Please respond to the following questions about your Marathon results | |
| My clock time for the marathon was: | hours:minutes:seconds |
| My chip time for the marathon was: | hours:minutes:seconds |
| Comments | text entry (large box) |

Table A.4: (continued) Complete wording and order of the survey questions for the Post-marathon survey

A.2 Additional Details of Basic Results

A.2.1 Summary Statistics of Demographic and Goal Measures

Table A.5 provides summary statistics of the demographic, goal, satisfaction, and performance measures.

A.2.2 Representativeness of Sample

We examined the representativeness of our sample relative to the overall population of the marathons used in our study. Although we do not have any data on goals or experience for marathon runners at large, our sample seems fairly representative in other dimensions. We refer to the 283,651 marathon finishers in our 15 target marathons as the population and to the 1,801 runners described above as the sample.

We created weighted averages by weighting the relevant statistics by the proportion of our sample in each marathon. For example, 168 (or 9.3%) of our 1,801 participants ran the 2007 Chicago Marathon. To compute the weighted average finishing time, we multiplied the finishing time for all runners in the 2007 Chicago Marathon (293.41 minutes; see Table A.1) by 9.3%, repeated this process for the other 14 marathons, and summed the 15 products.

Our sample ran slightly faster than the population of marathoners. The mean finishing time in our sample was 274.99 minutes (median: 272.35 minutes), while the weighted population mean was 279.41 minutes (median: 276.35 minutes). The interquartile range for our sample, [237.17, 309.87], was similar to the interquartile range for the population, [240.70, 315.25]. Similarly, the 10th and 90th percentile times for our sample were 209.53 and 342.88 minutes, respectively, whereas the same percentiles for the population were 215.22 and 352.80 minutes. The median finishing times across all United States marathon finishers (including marathons not in our sample) in the three years of our study were 281.55 minutes in 2007, 278.92 minutes in 2008, and 275.70 minutes in 2009. [3] Cumulative distributions of the sample and of the population (as determined by the weighting procedure above) are given in Figure A.1.

Figure A.1: Cumulative distributions of finishing times for the population of marathon runners (solid line) as well as the participants in our sample (dashed line). Horizontal axis: finishing time (2:00 to 8:00); vertical axis: proportion of runners (0.0 to 1.0).

The mean age of our runners was 37.36, which was almost identical to the mean age of runners in the marathon population, 37.21. The one dimension in which our sample was not representative is gender. Consistent with the finding that women are more likely to complete surveys than men (Curtin, Presser, and Singer, 2000), 57.3% of our participants were female, compared to 40.6% of runners in the marathon population.

[3] Marathon statistics were obtained from http://marathonguide.com.
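The weighting procedure of Section A.2.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses three of the fifteen rows of Table A.1 (complete participants and the all-finishers average time), so the weights are renormalized within that subset and the result is not the full-sample figure. The function name `weighted_population_mean` is hypothetical.

```python
# Each tuple: (marathon, complete participants in the study sample,
# average finishing time in minutes for ALL finishers), from Table A.1.
# Only three of the 15 marathons are listed, purely for illustration.
ROWS = [
    ("Marine Corps 2007", 203, 281.20),
    ("Twin Cities 2007", 161, 283.24),
    ("Boston 2008", 44, 231.68),
]


def weighted_population_mean(rows):
    """Weight each marathon's population statistic by its share of the sample."""
    n_total = sum(n for _, n, _ in rows)
    return sum((n / n_total) * stat for _, n, stat in rows)


subset_mean = weighted_population_mean(ROWS)
print(f"Sample-weighted population mean (3-marathon subset): {subset_mean:.2f} min")
```

The same function applies unchanged to any other population statistic (for example, percent female) once all 15 rows are supplied.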

| Type of measure | Measure | Average | Median | Std. Dev | Minimum | Maximum |
|---|---|---|---|---|---|---|
| Demographics | Age | 37.08 | 35 | 10.1 | 18 | 72 |
| | Female | 57.30% | | | | |
| Running background | Years running | 10.16 | 7 | 9.31 | 0 | 50 |
| | Previously run marathon | 71.60% | | | | |
| | Number of completed marathons | 5.56 | 2 | 15.65 | 0 | 395 |
| | Seriousness of running | 4.39 | 5 | 1.35 | 1 | 7 |
| | Best 10 kilometers | 0:50:38 | 0:49:56 | 10:12 | 30:31 | 2:01:29 |
| | Best half marathon | 1:59:30 | 1:55:00 | 27:24 | 1:06:10 | 5:26:47 |
| | Best marathon | 4:08:45 | 4:01:58 | 50: 5 | 2:16:36 | 8:17:42 |
| | Last marathon | 4:30:14 | 4:25:00 | 52:46 | 2:16:26 | 8:21:59 |
| Training | Days per week | 4.32 | 4 | 1.12 | 1 | 7 |
| | Miles per week | 35.48 | 35 | 15.62 | 3 | 200 |
| | Long run (miles) | 19.95 | 20 | 3.55 | 3 | 62 |
| Goals | Have goal | 86.06% | | | | |
| | Goal | 4:10:50 | 4:00:00 | 42:03 | 2:19:59 | 8:10:00 |
| | Likelihood of reaching goal | 74.01% | 80.00% | 18.07% | 0.00% | 100.00% |
| Pre-marathon objectives | Demonstrate self-confidence | 5 | 5 | 1.71 | 1 | 7 |
| | Meet time goal | 5.66 | 6 | 1.31 | 1 | 7 |
| | Demonstrate athletic ability | 5.81 | 6 | 1.41 | 1 | 7 |
| | Achieve feat | 5 | 5 | 1.7 | 1 | 7 |
| | See city | 3.08 | 3 | 1.9 | 1 | 7 |
| | Have fun | 5.89 | 6 | 1.32 | 1 | 7 |
| | Participate in a major event | 4.38 | 5 | 1.72 | 1 | 7 |
| | Run with friends/family | 3.93 | 4 | 2.11 | 1 | 7 |
| | Win cash | 1.24 | 1 | 0.84 | 1 | 7 |
| | Raise money | 2.41 | 2 | 1.88 | 1 | 7 |
| | Beat time goal | 5.34 | 6 | 1.63 | 1 | 7 |
| | Beat best time | 4.6 | 5 | 2.16 | 1 | 7 |
| | Place high | 3.92 | 4 | 1.79 | 1 | 7 |
| | Finish | 6.64 | 7 | 1.04 | 1 | 7 |
| Post-marathon objectives | Meet time goal | 5.80 | 6 | 1.27 | 1 | 7 |
| Performance | Finishing time | 4:34:59 | 4:32:21 | 52:11 | 2:17:21 | 8:25:26 |
| | Satisfaction with performance | 4.88 | 5 | 1.67 | 1 | 7 |
| | Satisfaction with effort | 5.56 | 6 | 1.36 | 1 | 7 |
| | Finishing time minus goal | 0:18:17 | 0:11:51 | 25:23 | 2:43:02 | -1:00:10 |
| | Relative finishing time minus goal | 73.66% | 73.66% | 73.66% | 73.66% | 73.66% |

Table A.5: Descriptive statistics on demographics, goals, running background, training, objectives, satisfaction, and performance.

Note: A random number has been added to the minimum and maximum for best 10 kilometers, best half marathon, best marathon, last marathon, and finishing time so that participants in our study cannot be identified. The random number is uniformly distributed between -3% of the actual time and +3% of the actual time. Most of the objective importance ratings reflect pre-marathon responses and thus only include participants in the goal-asked condition. We include pre- and post-marathon measures of goal importance because we use post-marathon measures of goal importance in our moderation analyses in Section A.3.9. The differences between pre- and post-marathon measures reflect in large part higher post-marathon ratings for goal-not-asked participants (t(1,540) = 2.08, p = .038).
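The masking of minima and maxima described in the note to Table A.5 can be illustrated with a short sketch. It is an assumption about how such a perturbation might be coded, not the authors' procedure: the function name `mask_time`, the use of seconds as the unit, and the seeded generator are all hypothetical choices.

```python
import random


def mask_time(seconds, rng, max_fraction=0.03):
    """Shift a time by a uniform draw between -3% and +3% of its own value."""
    noise = rng.uniform(-max_fraction * seconds, max_fraction * seconds)
    return seconds + noise


rng = random.Random(0)           # seeded only so the illustration is repeatable
actual = 2 * 3600 + 20 * 60      # a hypothetical 2:20:00 time, in seconds
masked = mask_time(actual, rng)
print(f"actual = {actual} s, masked = {masked:.0f} s")
```

Because the shift is proportional to the time itself, a reported extreme can differ from the true value by several minutes for marathon-length times.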

A.2.3 Goal Importance

Figure A.2 presents a histogram of participants' ratings of goal importance ("Meeting the time goal you have set for yourself"), reported post-marathon (M = 5.80) on a scale from 1 ("very unimportant") to 7 ("very important").

Figure A.2: Histogram of post-marathon reported goal importance on a 1-7 Likert scale. Horizontal axis: goal importance (1 to 7); vertical axis: frequency (0% to 40%).

A.2.4 Runner Experience

Figure A.3 presents a histogram of participants' previous marathon experience in terms of the self-reported number of marathons completed.

Figure A.3: Histogram of runner experience by number of previous marathons completed. Horizontal axis: number of completed marathons (0 to 10, then >10); vertical axis: frequency (0% to 30%).

A.2.5 Individual Marathon Statistics

Table A.6 shows, for each of the 15 marathons, the percentage of marathoners who met their time goal ("goal success rate"), relative performance (time goal minus finishing time), normalized relative performance (time goal minus finishing time, divided by time goal), and the high, average, and low temperature at the marathon site on the day of that marathon. [4]

| Marathon | Goal success rate (percent) | Relative performance (minutes) | Normalized relative performance (percent) | High temp. (°F) | Average temp. (°F) | Low temp. (°F) |
|---|---|---|---|---|---|---|
| Chicago 2007 | 4.0% | -50.75 | -21.2% | 93 | 80 | 70 |
| Marine Corps 2007 | 31.3% | -12.92 | -4.9% | 62 | 57 | 52 |
| New York City 2007 | 18.2% | -9.55 | -4.1% | 56 | 50 | 45 |
| Portland 2007 | 25.0% | -15.01 | -5.5% | 59 | 50 | 43 |
| Twin Cities 2007 | 7.5% | -31.89 | -13.8% | 84 | 78 | 73 |
| Boston 2008 | 31.6% | -5.98 | -2.7% | 55 | 48 | 44 |
| Grandma's 2008 | 21.1% | -12.99 | -5.1% | 79 | 66 | 55 |
| Los Angeles 2008 | 12.8% | -15.23 | -5.2% | 63 | 54 | 46 |
| Rock 'n' Roll (San Diego) 2008 | 18.9% | -20.94 | -7.7% | 66 | 62 | 61 |
| Chicago 2008 | 7.7% | -28.78 | -11.4% | 86 | 69 | 57 |
| Marine Corps 2008 | 27.6% | -13.23 | -5.5% | 63 | 54 | 45 |
| Twin Cities 2008 | 44.1% | -7.86 | -3.4% | 61 | 54 | 48 |
| Chicago 2009 | 44.0% | -6.54 | -2.6% | 43 | 36 | 28 |
| Marine Corps 2009 | 27.9% | -13.11 | -5.1% | 72 | 58 | 49 |
| Twin Cities 2009 | 45.8% | -5.61 | -2.2% | 55 | 50 | 46 |
| All Marathons | 25.3% | -18.28 | -7.40% | 68 | 59 | 51 |

Table A.6: Performance and weather statistics for each of the 15 marathons. The correlations between high temperature and (i) goal success rate, (ii) relative performance, and (iii) normalized relative performance are r = -0.78, r = -0.85, and r = -0.85, respectively: hotter race days go with fewer goals met and larger shortfalls.

[4] Weather information for each marathon was obtained from http://www.wunderground.com/history/.

A.3 Robustness Analyses

In this section, we present additional analyses to test the robustness of the findings reported in the main paper. In Section A.3.1, we fit higher-order piecewise polynomials and test whether our reference dependence findings are sensitive to model selection. In Section A.3.2, we test alternative locations for the knot that ties the two pieces of the polynomials together. These analyses test whether the reference point should be located at zero, as we have assumed in the main text. In Section A.3.3, we refit the linear, quadratic, and cubic polynomials to relative performance (in minutes) instead of normalized relative performance as used in the main text. In Section A.3.4, we remove influential outliers and refit the models. We also remove all observations from one marathon, the 2007 Chicago Marathon, which had distinctly different characteristics from the other marathons. In Section A.3.5, we repeat the multiple reference point analysis from the main paper, using runners' most recent marathon time, rather than their best marathon time, as a second reference point. In Section A.3.6, we elaborate on the change in time goals elicited before and after the marathon for participants in the goal-asked conditions. We also examine whether reference dependence holds for pre-marathon goals. In Section A.3.7, we test for any impact on reference dependence of having elicited a goal prior to the marathon. In Section A.3.8, we test whether the timing of satisfaction elicitation following the marathon had any effect on our estimates of reference dependence. In Section A.3.9, we reproduce the moderation analysis from the main paper, which examines the effect of goal importance and runner experience on loss aversion, using alternative dichotomizations of experience. Finally, in Section A.3.10, we present alternative analyses of participants' predicted satisfaction.

A.3.1 Higher Order Models

In the main paper, we fit a piecewise quadratic polynomial to our data using ordered logit regression. We established that the quadratic model exhibits reference dependence in the form of a jump at the reference point, a steeper slope in the loss domain than in the gain domain, and diminishing sensitivity in gains and losses. Here we additionally fit all combinations of linear, quadratic, cubic, quartic, and quintic polynomials in both gains and losses, yielding 25 models. We then test each model for a jump at the reference point (Table A.7), a difference in loss and gain slopes at x = 3% (Table A.8) and x = 5% (Table A.9), and a difference in loss and gain levels at x = 3% (Table A.10) and x = 5% (Table A.11). Finally, we also test for diminishing sensitivity adjacent to the reference point in losses (Table A.12) and gains (Table A.13). The shaded boxes represent results that are significant at the 0.05 level.

The basic findings from the main paper (a discontinuity at the reference point, a difference in the magnitude of gain and loss slopes away from the reference point, and diminishing sensitivity) are broadly robust to alternative specifications of the piecewise polynomial. Notably, evidence for reference dependence is weakest for models that fit higher-order polynomials in gains. Our sample is highly skewed, with an interquartile range of normalized relative performance of -12.3% to 0%. Consequently, while higher-order polynomials are well-behaved when fit to the loss domain, they are overfit and overly influenced by outliers in the gain domain. This overfitting in gains is confirmed by model validation exercises.

Furthermore, this analysis illustrates the difficulty in statistically discriminating between a jump at the reference point and a more pronounced difference in slopes across the reference point. Moving down the rows in Table A.8 and Table A.9 (i.e., going from a lower-order to a higher-order polynomial fit to performance in the loss domain), we observe an increasing difference in loss and gain slopes. Moving down the rows in Table A.7, we observe a diminishing magnitude of the jump at the reference point.
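One way to picture the 25 specifications of Section A.3.1 is as a grid of (loss order, gain order) pairs, each defining a set of regressors around the reference point. The sketch below is an illustration under assumptions, not the authors' estimation code: the regressor layout (a jump indicator plus separate polynomial terms on each side of zero), the treatment of x = 0 as a gain, and all names are hypothetical, and the ordered logit fit itself is omitted.

```python
from itertools import product

# Polynomial orders considered on each side of the reference point.
ORDERS = {"linear": 1, "quadratic": 2, "cubic": 3, "quartic": 4, "quintic": 5}

# All (loss specification, gain specification) pairs: 5 x 5 = 25 models.
SPECS = list(product(ORDERS, ORDERS))


def features(x, loss_order, gain_order):
    """Regressors for one runner with normalized relative performance x.

    Returns [jump indicator, loss-side powers..., gain-side powers...].
    Assumption: meeting the goal exactly (x == 0) counts as a gain.
    """
    is_gain = 1.0 if x >= 0 else 0.0
    loss_terms = [(x ** k) * (1.0 - is_gain) for k in range(1, loss_order + 1)]
    gain_terms = [(x ** k) * is_gain for k in range(1, gain_order + 1)]
    return [is_gain] + loss_terms + gain_terms


print(len(SPECS), "specifications")
print(features(-0.05, ORDERS["quadratic"], ORDERS["quadratic"]))
```

Under this layout the coefficient on the first regressor plays the role of the jump at the reference point, and the number of regressors grows by one for each extra polynomial order on either side.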
| Loss domain (rows) / Gain domain (columns) | Linear | Quadratic | Cubic | Quartic | Quintic |
|---|---|---|---|---|---|
| Linear | 1.54 (p < .001) | 1.30 (p < .001) | 1.24 (p < .001) | 1.00 (p < .001) | 0.73 (p = 0.013) |
| Quadratic | 1.21 (p < .001) | 0.97 (p < .001) | 0.91 (p < .001) | 0.65 (p = 0.015) | 0.37 (p = 0.215) |
| Cubic | 1.01 (p < .001) | 0.77 (p < .001) | 0.71 (p = 0.002) | 0.44 (p = 0.103) | 0.17 (p = 0.589) |
| Quartic | 0.81 (p < .001) | 0.57 (p = 0.009) | 0.51 (p = 0.040) | 0.24 (p = 0.400) | -0.04 (p = 0.895) |
| Quintic | 0.67 (p = 0.002) | 0.43 (p = 0.067) | 0.36 (p = 0.162) | 0.09 (p = 0.753) | -0.19 (p = 0.564) |

Table A.7: The coefficient on the jump at the reference point (Ĵ) and p-values for all combinations of linear, quadratic, cubic, quartic, and quintic models in gains and losses. Shaded boxes indicate significance at the 0.05 level or better.

| Loss domain (rows) / Gain domain (columns) | Linear | Quadratic | Cubic | Quartic | Quintic |
|---|---|---|---|---|---|
| Linear | 4.23 (p = 0.330) | -3.80 (p = 0.455) | -3.97 (p = 0.436) | 5.58 (p = 0.456) | 20.75 (p = 0.061) |
| Quadratic | 11.28 (p = 0.012) | 3.22 (p = 0.536) | 3.02 (p = 0.562) | 13.04 (p = 0.085) | 28.46 (p = 0.011) |
| Cubic | 15.85 (p < .001) | 7.77 (p = 0.150) | 7.57 (p = 0.161) | 17.77 (p = 0.021) | 33.38 (p = 0.003) |
| Quartic | 20.79 (p < .001) | 12.70 (p = 0.026) | 12.50 (p = 0.029) | 22.84 (p = 0.004) | 38.56 (p < .001) |
| Quintic | 23.97 (p < .001) | 15.88 (p = 0.008) | 15.68 (p = 0.009) | 26.04 (p = 0.001) | 41.83 (p < .001) |

Table A.8: Loss slopes (S'(-x)) minus gain slopes (S'(x)) at x = 3% and p-values for all combinations of linear, quadratic, cubic, quartic, and quintic models in gains and losses. The p-values are obtained by Wald tests that examine the null hypothesis of equality of the loss and gain slopes. Shaded boxes indicate significance at the 0.05 level or better.

| Loss domain (rows) / Gain domain (columns) | Linear | Quadratic | Cubic | Quartic | Quintic |
|---|---|---|---|---|---|
| Linear | 4.23 (p = 0.330) | 3.49 (p = 0.382) | 6.08 (p = 0.351) | 19.25 (p = 0.053) | 12.73 (p = 0.232) |
| Quadratic | 10.50 (p = 0.018) | 9.70 (p = 0.018) | 12.53 (p = 0.057) | 26.39 (p = 0.008) | 19.84 (p = 0.064) |
| Cubic | 13.92 (p = 0.002) | 13.11 (p = 0.002) | 15.98 (p = 0.017) | 30.08 (p = 0.003) | 23.49 (p = 0.029) |
| Quartic | 16.56 (p < .001) | 15.75 (p < .001) | 18.68 (p = 0.006) | 32.95 (p = 0.001) | 26.35 (p = 0.015) |
| Quintic | 17.16 (p < .001) | 16.36 (p < .001) | 19.31 (p = 0.004) | 33.60 (p < .001) | 26.99 (p = 0.013) |

Table A.9: Loss slopes (S'(-x)) minus gain slopes (S'(x)) at x = 5% and p-values for all combinations of linear, quadratic, cubic, quartic, and quintic models in gains and losses. The p-values are obtained by Wald tests that examine the null hypothesis of equality of the loss and gain slopes. Shaded boxes indicate significance at the 0.05 level or better.

| Loss domain (rows) / Gain domain (columns) | Linear | Quadratic | Cubic | Quartic | Quintic |
|---|---|---|---|---|---|
| Linear | 1.67 (p < .001) | 1.02 (p = 0.003) | 0.87 (p = 0.059) | 0.27 (p = 0.631) | -0.22 (p = 0.730) |
| Quadratic | 1.56 (p < .001) | 0.92 (p = 0.008) | 0.75 (p = 0.105) | 0.12 (p = 0.833) | -0.38 (p = 0.546) |
| Cubic | 1.53 (p < .001) | 0.89 (p = 0.011) | 0.71 (p = 0.123) | 0.08 (p = 0.895) | -0.43 (p = 0.493) |
| Quartic | 1.54 (p < .001) | 0.89 (p = 0.010) | 0.72 (p = 0.120) | 0.07 (p = 0.897) | -0.44 (p = 0.489) |
| Quintic | 1.58 (p < .001) | 0.93 (p = 0.008) | 0.75 (p = 0.104) | 0.11 (p = 0.853) | -0.41 (p = 0.520) |

Table A.10: Loss levels (S(0) - S(-x)) minus gain levels (S(x) - S(0)) at x = 3% and p-values for all combinations of linear, quadratic, cubic, quartic, and quintic models in gains and losses. The p-values are obtained by Wald tests that examine the null hypothesis of equality of the loss and gain levels. Shaded boxes indicate significance at the 0.05 level or better.
| Loss domain (rows) / Gain domain (columns) | Linear | Quadratic | Cubic | Quartic | Quintic |
|---|---|---|---|---|---|
| Linear | 1.75 (p < .001) | 1.02 (p = 0.016) | 0.89 (p = 0.069) | 0.56 (p = 0.287) | 0.19 (p = 0.739) |
| Quadratic | 1.78 (p < .001) | 1.05 (p = 0.013) | 0.91 (p = 0.066) | 0.55 (p = 0.294) | 0.17 (p = 0.759) |
| Cubic | 1.83 (p < .001) | 1.09 (p = 0.010) | 0.95 (p = 0.054) | 0.59 (p = 0.262) | 0.21 (p = 0.715) |
| Quartic | 1.92 (p < .001) | 1.18 (p = 0.006) | 1.03 (p = 0.037) | 0.67 (p = 0.206) | 0.28 (p = 0.619) |
| Quintic | 1.99 (p < .001) | 1.25 (p = 0.003) | 1.10 (p = 0.026) | 0.74 (p = 0.164) | 0.35 (p = 0.539) |

Table A.11: Loss levels (S(0) - S(-x)) minus gain levels (S(x) - S(0)) at x = 5% and p-values for all combinations of linear, quadratic, cubic, quartic, and quintic models in gains and losses. The p-values are obtained by Wald tests that examine the null hypothesis of equality of the loss and gain levels. Shaded boxes indicate significance at the 0.05 level or better.

| Loss domain (rows) / Gain domain (columns) | Linear | Quadratic | Cubic | Quartic | Quintic |
|---|---|---|---|---|---|
| Linear | NA | NA | NA | NA | NA |
| Quadratic | 19.54 (p < .001) | 19.53 (p < .001) | 19.54 (p < .001) | 19.61 (p < .001) | 19.63 (p < .001) |
| Cubic | 52.66 (p < .001) | 52.65 (p < .001) | 52.67 (p < .001) | 52.91 (p < .001) | 53.05 (p < .001) |
| Quartic | 135.46 (p < .001) | 135.67 (p < .001) | 135.76 (p < .001) | 136.40 (p < .001) | 136.77 (p < .001) |
| Quintic | 261.40 (p < .001) | 262.50 (p < .001) | 262.68 (p < .001) | 263.51 (p < .001) | 264.33 (p < .001) |

Table A.12: Diminishing sensitivity in losses near the origin (as measured by n̂_2) and p-values for all combinations of linear, quadratic, cubic, quartic, and quintic models in gains and losses. Shaded boxes indicate significance at the 0.05 level or better.

| Loss domain (rows) / Gain domain (columns) | Linear | Quadratic | Cubic | Quartic | Quintic |
|---|---|---|---|---|---|
| Linear | NA | -182.13 (p = 0.002) | -318.00 (p = 0.252) | -1849.59 (p = 0.047) | -5638.25 (p = 0.012) |
| Quadratic | NA | -181.62 (p = 0.002) | -330.50 (p = 0.234) | -1938.79 (p = 0.038) | -5788.78 (p = 0.010) |
| Cubic | NA | -181.68 (p = 0.002) | -332.57 (p = 0.232) | -1967.66 (p = 0.035) | -5861.95 (p = 0.010) |
| Quartic | NA | -182.12 (p = 0.002) | -336.01 (p = 0.228) | -1989.23 (p = 0.034) | -5907.55 (p = 0.009) |
| Quintic | NA | -182.70 (p = 0.002) | -337.91 (p = 0.225) | -1994.12 (p = 0.033) | -5927.60 (p = 0.009) |

Table A.13: Diminishing sensitivity in gains near the origin (as measured by p̂_2) and p-values for all combinations of linear, quadratic, cubic, quartic, and quintic models in gains and losses. Shaded boxes indicate significance at the 0.05 level or better.

A.3.2 Location of the Reference Point

In the main paper, we assumed that the reference point lies at zero (i.e., finishing time equals time goal) and tested for reference dependence around that point. Here we test whether that assumption is appropriate by fitting the piecewise linear, quadratic, and cubic polynomials with knot points ranging from -10% to 5% to identify the knot location that provides the best fit to the data for each model. The left panels of Figure A.4 display the log likelihood for each model as a function of knot placement. For the linear and quadratic models, a local maximum is located near the assumed reference point, but the global maximum in each case is far in the loss domain.

A likely explanation for this pattern is our highly skewed data, with far more observations below than above the reference point. For example, 10% of our participants fall short of their goal by more than 20.8%. Thus a piecewise polynomial with a knot closer to the center of the data is likely to provide a better fit. We therefore determine the best-fitting knot again, limiting the range of normalized relative performance to [-15%, 15%] (1,254 or 81.7% of our 1,534 participants with goals and satisfaction measures). The results for this restricted sample are displayed in the right panels of Figure A.4. For all models, the global maximum for the restricted sample is located at approximately 0.3%, very close to zero and in support of our choice of the time goal as a reference point.

[Figure A.4 panels: vertical axis, log likelihood; horizontal axis, normalized relative performance (knot location) from -10% to 5%.]

Figure A.4: Log likelihood for the linear (top panels), quadratic (middle panels), and cubic (bottom panels) models as a function of knot placement. The left panels present log likelihoods for the models fit to the full range of data, while the right panels present log likelihoods for the models fit to a range of normalized relative performance of -15% to 15%.
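The search over knot locations amounts to profiling the log likelihood over a grid of candidate knots. A minimal sketch follows. Here `fit_loglik` is a hypothetical callable standing in for refitting the ordered logit model with the knot at a given location, and the grid step of 0.1 percentage points is an assumption, since the text does not state the spacing.

```python
def knot_grid(lo=-0.10, hi=0.05, step=0.001):
    """Candidate knot locations from lo to hi inclusive (assumed spacing)."""
    n = int(round((hi - lo) / step))
    return [round(lo + i * step, 6) for i in range(n + 1)]


def best_knot(fit_loglik, knots):
    """Return (knot, log likelihood) for the knot with the highest log likelihood."""
    return max(((k, fit_loglik(k)) for k in knots), key=lambda pair: pair[1])


# Toy stand-in for the refitted log likelihood, peaked at 0.3% purely for
# illustration; a real analysis would refit the model at each knot.
def toy_loglik(knot):
    return -1830.0 - 4000.0 * (knot - 0.003) ** 2


knot, loglik = best_knot(toy_loglik, knot_grid())
```

Restricting the sample, as done for the right panels, would correspond to passing a `fit_loglik` that refits on observations within [-15%, 15%] only.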

A.3.3 Absolute Performance

In the main paper, we estimated models of satisfaction as a function of normalized relative performance: the difference between a participant's running time and their time goal, divided by their time goal. This was done to make performance more readily comparable across runners with wide-ranging goals and performance. Here we fit the same models using an absolute measure of relative performance (see Figure 1 in the main text). Recall that relative performance is the difference between the time goal and the finishing time. In this analysis, we scale relative performance to be a fraction of one hour to make the coefficients somewhat more comparable to those estimated in the main text. The parameter estimates are presented in Table A.14. The results using the non-normalized relative performance measure are qualitatively similar to those we presented in the main paper using a normalized measure. We find evidence of loss aversion, as presented in Table A.15. Diminishing sensitivity is also significant in both the loss (χ²(1) = 35.76, p < .001) and gain (χ²(1) = 9.12, p = .002) domains.

| Term | Predictor | β̂ |
|---|---|---|
| Linear | losses (n̂_1) | 4.193*** |
| Linear | gains (p̂_1) | 4.520*** |
| Quadratic | losses (n̂_2) | 1.235*** |
| Quadratic | gains (p̂_2) | 7.299*** |
| | jump (Ĵ) | 1.089*** |
| | Observations | 1534 |
| | Degrees of freedom | 20 |
| | -2 log likelihood | 4698.85 |

Table A.14: Ordered logit regression on piecewise quadratic polynomial relating satisfaction to relative performance in fractions of an hour rather than normalized relative performance. The statistical significance of coefficients is indicated by: *** (p < .01), ** (p < .05), and * (p < .10).
| x | Loss slope | Gain slope | p-value (slopes) | Loss level | Gain level | p-value (levels) |
|---|---|---|---|---|---|---|
| 0 | 4.19 | 4.52 | 0.843 | 1.09 | 0.00 | 0.000 |
| 1/60 | 4.15 | 4.28 | 0.937 | 1.16 | 0.07 | 0.000 |
| 2/60 | 4.11 | 4.03 | 0.959 | 1.23 | 0.14 | 0.000 |
| 3/60 | 4.07 | 3.79 | 0.847 | 1.30 | 0.21 | 0.000 |
| 4/60 | 4.03 | 3.55 | 0.727 | 1.36 | 0.27 | 0.000 |
| 5/60 | 3.99 | 3.30 | 0.604 | 1.43 | 0.33 | 0.000 |
| 6/60 | 3.95 | 3.06 | 0.481 | 1.50 | 0.38 | 0.000 |
| 7/60 | 3.90 | 2.82 | 0.364 | 1.56 | 0.43 | 0.000 |
| 8/60 | 3.86 | 2.57 | 0.259 | 1.63 | 0.47 | 0.000 |
| 9/60 | 3.82 | 2.33 | 0.171 | 1.69 | 0.51 | 0.001 |
| 10/60 | 3.78 | 2.09 | 0.104 | 1.75 | 0.55 | 0.001 |
| 11/60 | 3.74 | 1.84 | 0.057 | 1.82 | 0.58 | 0.001 |
| 12/60 | 3.70 | 1.60 | 0.028 | 1.88 | 0.61 | 0.001 |
| 13/60 | 3.66 | 1.36 | 0.012 | 1.94 | 0.64 | 0.001 |

Table A.15: Tests of loss aversion using relative performance, defined as the difference between time goal and finishing time in fractions of an hour. Loss and gain slopes (S'(-x) and S'(x)) and loss and gain levels (S(0) - S(-x) and S(x) - S(0)) at different values of x. The p-values are obtained by Wald tests that examine the null hypothesis of equality of the loss and gain slopes and the loss and gain levels.
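The slope and level columns of Table A.15 can be recovered from the coefficients in Table A.14. The sketch below assumes that the tabled coefficients are magnitudes and that the fitted quadratic is concave in gains and convex in losses, so that both slopes shrink away from the reference point. This sign convention is inferred from the tabled values rather than stated in the text, and the function names are illustrative.

```python
# Coefficients from Table A.14 (relative performance in fractions of an hour).
N1, P1 = 4.193, 4.520   # linear terms: losses, gains
N2, P2 = 1.235, 7.299   # quadratic terms: losses, gains
JUMP = 1.089            # jump at the reference point


def loss_slope(x):
    """Slope of satisfaction at -x, i.e. x below the goal (assumed form)."""
    return N1 - 2 * N2 * x


def gain_slope(x):
    """Slope of satisfaction at +x, i.e. x above the goal (assumed form)."""
    return P1 - 2 * P2 * x


def loss_level(x):
    """S(0) - S(-x): drop in satisfaction from falling short by x."""
    return JUMP + N1 * x - N2 * x ** 2


def gain_level(x):
    """S(x) - S(0): rise in satisfaction from beating the goal by x."""
    return P1 * x - P2 * x ** 2


for minutes in (0, 1, 13):
    x = minutes / 60
    print(minutes, round(loss_slope(x), 2), round(gain_slope(x), 2),
          round(loss_level(x), 2), round(gain_level(x), 2))
```

Under these assumptions, the values printed for x = 0, 1/60, and 13/60 round to the corresponding rows of Table A.15, which suggests the inferred convention is at least consistent with the reported results.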

A.3.4 Outlier Removal

We next identify and drop influential outliers from our data and refit the models to the reduced data set. DFBETAS (standardized difference of the beta) measures the degree of influence of a single observation on the coefficients of the fitted model: the larger the value of DFBETAS, the greater the influence of the observation on a particular coefficient.⁵ We identify outliers for each of our models and refit those models to a reduced data set with the outliers removed. Table A.16 presents the parameter estimates for these refit models. The basic results are qualitatively similar to those from the models fit to the full data set. We find evidence for loss aversion (see Table A.17). Diminishing sensitivity is also significant in both losses (χ²(1) = 42.86, p < .001) and gains (χ²(1) = 9.22, p = .002). Results for other cutoff levels for eliminating outliers are similar.

| Term | Predictor | β̂ |
|---|---|---|
| Linear | losses (n̂_1) | 21.74*** |
| Linear | gains (p̂_1) | 24.31*** |
| Quadratic | losses (n̂_2) | 31.92*** |
| Quadratic | gains (p̂_2) | 180.41*** |
| | jump (Ĵ) | 0.82*** |
| | Outliers removed | 28 |
| | Degrees of freedom | 20 |
| | -2 log likelihood | 4549.77 |

Table A.16: Ordered logit regression on piecewise polynomials relating satisfaction to normalized relative performance, refit to reduced datasets with outliers removed using DFBETAS. The statistical significance of coefficients is indicated by: *** (p < .01), ** (p < .05), and * (p < .10).
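For a sense of scale, the influence cutoff described in the accompanying footnote can be evaluated directly. The sketch assumes the cutoff has the form 2 * sqrt((p + 1) / (n - p - 1)). The value p = 20 is a placeholder chosen for illustration (the text does not report the p used for this calculation), while n = 1,534 is the full-sample size reported above. The helper `flag_influential` and its input layout (one row of DFBETAS values per observation) are likewise hypothetical.

```python
import math


def dfbetas_cutoff(p, n):
    """Cutoff 2 * sqrt((p + 1) / (n - p - 1)) for flagging influential points.

    p: number of parameters excluding intercepts; n: number of observations.
    """
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    return 2.0 * math.sqrt((p + 1) / (n - p - 1))


def flag_influential(dfbetas_rows, cutoff):
    """Indices of observations whose largest |DFBETAS| exceeds the cutoff."""
    return [i for i, row in enumerate(dfbetas_rows)
            if max(abs(v) for v in row) > cutoff]


cutoff = dfbetas_cutoff(p=20, n=1534)  # p = 20 is a placeholder, not from the text
```

Observations returned by `flag_influential` would be the ones dropped before refitting, under the assumption that an observation is removed when any of its coefficient-wise DFBETAS values exceeds the cutoff.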
| x | Loss slope | Gain slope | p-value (slopes) | Loss level | Gain level | p-value (levels) |
|---|---|---|---|---|---|---|
| 0.0% | 21.74 | 24.31 | 0.750 | 0.82 | 0.00 | 0.000 |
| 0.5% | 21.42 | 22.51 | 0.886 | 0.93 | 0.12 | 0.000 |
| 1.0% | 21.11 | 20.70 | 0.954 | 1.04 | 0.23 | 0.001 |
| 1.5% | 20.79 | 18.90 | 0.774 | 1.14 | 0.32 | 0.003 |
| 2.0% | 20.47 | 17.09 | 0.582 | 1.25 | 0.41 | 0.005 |
| 2.5% | 20.15 | 15.29 | 0.393 | 1.35 | 0.49 | 0.008 |
| 3.0% | 19.83 | 13.49 | 0.231 | 1.45 | 0.57 | 0.011 |
| 3.5% | 19.51 | 11.68 | 0.112 | 1.55 | 0.63 | 0.013 |
| 4.0% | 19.19 | 9.88 | 0.044 | 1.64 | 0.68 | 0.014 |
| 4.5% | 18.87 | 8.07 | 0.013 | 1.74 | 0.73 | 0.013 |
| 5.0% | 18.55 | 6.27 | 0.003 | 1.83 | 0.76 | 0.012 |

Table A.17: Test of loss aversion using normalized relative performance, with outliers removed using DFBETAS. Loss and gain slopes (S'(-x) and S'(x)) and loss and gain levels (S(0) - S(-x) and S(x) - S(0)) at different values of x. The p-value is obtained by a Wald test that examines the null hypothesis of equality of the loss and gain slopes and the loss and gain levels.

We additionally drop participants from one of our 15 marathons, the 2007 Chicago Marathon, and refit our models to the remaining observations. This marathon took place on a day of record heat (reaching a peak of 93 degrees, the hottest October 7th on record, compared with a daily average high of 66 degrees for that day), causing hundreds of runners to fall ill or drop out, and prompting the organizers to stop the race 3.5 hours after it began.⁶ Of those who finished the race before it was stopped, a significantly smaller proportion met their time goal relative to other runners in our sample (M = 4.0% versus M = 27.7%, χ²(1) = 38.83, p < .001), and runners in that marathon also finished further behind their goal than other runners (M = 50.75 minutes versus M = 14.80 minutes, t(165.76) = 13.84, p < .001). The results of our model after dropping observations from the 2007 Chicago Marathon are presented in Table A.18. Again, the basic results hold, showing evidence of loss aversion (see Table A.19). Diminishing sensitivity is also significant in both the loss (χ²(1) = 23.23, p < .001) and gain (χ²(1) = 9.24, p < .01) domains.

| Term | Predictor | β̂ |
|---|---|---|
| Linear | losses (n̂_1) | 19.73*** |
| Linear | gains (p̂_1) | 26.55*** |
| Quadratic | losses (n̂_2) | 22.66*** |
| Quadratic | gains (p̂_2) | 182.67*** |
| | jump (Ĵ) | 0.92*** |
| | Observations | 1384 |
| | Degrees of freedom | 19 |
| | -2 log likelihood | 4106.42 |

Table A.18: Ordered logit regression on piecewise quadratic polynomial relating satisfaction to normalized relative performance, refit to reduced datasets with observations from the 2007 Chicago Marathon removed. The statistical significance of coefficients is indicated by: *** (p < .01), ** (p < .05), and * (p < .10).

| x | Loss slope | Gain slope | p-value (slopes) | Loss level | Gain level | p-value (levels) |
|---|---|---|---|---|---|---|
| 0.0% | 19.73 | 26.55 | 0.403 | 0.92 | 0.00 | 0.000 |
| 0.5% | 19.50 | 24.72 | 0.495 | 1.02 | 0.13 | 0.000 |
| 1.0% | 19.28 | 22.89 | 0.612 | 1.12 | 0.25 | 0.001 |
| 1.5% | 19.05 | 21.07 | 0.762 | 1.21 | 0.36 | 0.002 |
| 2.0% | 18.82 | 19.24 | 0.946 | 1.31 | 0.46 | 0.005 |
| 2.5% | 18.60 | 17.41 | 0.838 | 1.40 | 0.55 | 0.009 |
| 3.0% | 18.37 | 15.59 | 0.604 | 1.49 | 0.63 | 0.014 |
| 3.5% | 18.14 | 13.76 | 0.381 | 1.59 | 0.71 | 0.018 |
| 4.0% | 17.92 | 11.93 | 0.202 | 1.68 | 0.77 | 0.021 |
| 4.5% | 17.69 | 10.11 | 0.087 | 1.76 | 0.82 | 0.023 |
| 5.0% | 17.46 | 8.28 | 0.031 | 1.85 | 0.87 | 0.022 |

Table A.19: Tests of loss aversion using normalized relative performance, dropping results of the 2007 Chicago Marathon. Loss and gain slopes (S'(-x) and S'(x)) and loss and gain levels (S(0) - S(-x) and S(x) - S(0)) at different values of x. The p-value is obtained by a Wald test that examines the null hypothesis of equality of the loss and gain slopes and the loss and gain levels.

⁵ Specifically, the DFBETAS for an observation is the difference between a regression coefficient for a particular variable calculated when the model is fit to the entire data set including that observation, and that calculated when the model is fit to a data set with that observation removed, scaled by the standard error from the second fitting. We use a standard cutoff for identifying influential outliers of 2√((p + 1)/(n - p - 1)), where p is the number of parameters excluding intercepts and n is the number of observations (Harrell 2010).

⁶ http://www.nytimes.com/2007/10/08/us/08chicago.html. Referenced on January 5, 2018.
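The reported comparison of goal attainment for the 2007 Chicago Marathon can be approximately reproduced with a 2 x 2 chi-square test. The counts used below are not given in the text: they are back-calculated, as an assumption, from the reported percentages and from the observation counts in the tables (1,534 in the full sample and 1,384 once the 2007 Chicago Marathon is dropped, implying 150 Chicago runners). The use of a continuity correction is likewise an assumption about how the test was run.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Pearson chi-square statistic (1 df) for the 2 x 2 table [[a, b], [c, d]].

    With yates=True, 0.5 is subtracted from each |observed - expected|
    (floored at zero) before squaring.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / expected
    return stat


# Assumed counts: 6 of 150 Chicago runners (4.0%) and 383 of 1,384 other
# runners (27.7%) met their time goal.
stat = chi_square_2x2(6, 144, 383, 1001)
print(round(stat, 2))
```

With these assumed counts, the corrected statistic rounds to 38.83, in line with the reported χ²(1) = 38.83; the uncorrected statistic is somewhat larger.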

A.3.5 Multiple Reference Points

In the main text, we estimated a multiple reference point model, incorporating both a runner's time goal and their best previous marathon time as reference points. Here we perform the same analysis using a runner's time goal and their most recent marathon time as reference points. Table A.20 shows the estimates of this model, and Figure A.5 displays the separate marginal effects of performance relative to the time goal and performance relative to most recent marathon time. These findings are nearly identical to those from the multiple reference point model presented in the main paper. The dual reference point model provides a significantly better fit than the model incorporating only the time goal as a reference point (likelihood-ratio test, χ²(5) = 261.61, p < .001), and as before this comparison mostly impacts overall satisfaction by producing an additional jump when performance exceeds a runner's most recent marathon time.

| Term | Predictor | Time Goal | Last Time |
|---|---|---|---|
| Linear | losses | (n̂_1g) 18.01*** | (n̂_1l) 2.49 |
| Linear | gains | (p̂_1g) 32.58** | (p̂_1l) 1.78 |
| Quadratic | losses | (n̂_2g) 18.97*** | (n̂_2l) 2.34 |
| Quadratic | gains | (p̂_2g) 303.62 | (p̂_2l) 0.73 |
| | Jump | (Ĵ_g) 0.81*** | (Ĵ_l) 0.61*** |
| | Observations | 1015 | |
| | Degrees of freedom | 25 | |
| | -2 log likelihood | 3071.98 | |

Table A.20: Multiple reference point analysis. Ordered logit regression on quadratic polynomial relating satisfaction to normalized relative performance, defined relative to both time goal and a runner's most recent marathon time. The statistical significance of coefficients is indicated by: *** (p < .01), ** (p < .05), and * (p < .10).

[Figure A.5 panels: left, log odds (vertical axis); right, expected satisfaction (vertical axis); both plotted against normalized relative performance from -20% to 5%.]

Figure A.5: Marginal effects of normalized performance relative to time goal (solid), and relative to last marathon time (dashed), on satisfaction. The left panel plots the log odds of meeting or exceeding the midpoint of our satisfaction scale (4), while the right panel presents expected satisfaction on the original ordinal (1-7) scale. For each marginal effect plot, performance relative to the other reference point was held at zero.

A.3.6 Analyzing Goal Change

Recall that we elicited time goals from participants in the goal-asked conditions (n = 1,055) both prior to and after the marathon. 85.4% of participants in the goal-asked conditions had time goals before the marathon. This percentage increased slightly, but not significantly, to 86.1% after the marathon (t(1054) = 0.266, p = 0.790). Overall, runners reported less ambitious goals after the marathon (M = 246.50) than before it (M = 245.23) (t(835) = 3.30, p = 0.001). 30.1% (19.7%) of participants reported less (more) optimistic goals in the post-marathon survey than in the pre-marathon survey. 50.1% of participants provided identical goals in the two surveys.

Since a minority of runners achieved their time goal, reporting less ambitious goals after the marathon narrows the average gap between goals and performance. Thus, we examine how goals change as a function of performance relative to the pre-marathon time goal. Among participants who provided both pre- and post-marathon goals, 21.7% met their pre-marathon goals and 24.5% met their post-marathon goals. This difference is significant (t(835) = 2.80, p = .005). In addition, 39.0% of runners who fell short of their pre-marathon goal increased their reported goal following the marathon, compared to 6.0% of the runners who bested their pre-marathon goal (χ²(1, 836) = 25.44, p < 0.001).
Of course, such a shift could occur for non-psychological reasons (such as a change in the weather, as happened with the 2007 Chicago Marathon, or the onset of an injury) or for psychological reasons (such as the desire to self-enhance). Nevertheless, whatever the reason, the mean change is still relatively small (M = 1.27).

We fit the quadratic model using the pre-marathon goals and refit the same model to the post-marathon goals for the subset of participants who provided both pre- and post-marathon goals. The parameter estimates for both models are presented in Table A.21. Overall, the model fit to post-marathon goals provides a better fit to the data as measured by log likelihood. More importantly, we largely replicate our results for both pre- and post-marathon goals. For pre-marathon goals, the loss level significantly exceeds the gain level at p < .05 for values of x from 0-5% (e.g., χ²(1) = 5.37, p = .021 at x = 5%). For post-marathon goals, the loss level exceeds the gain level at p < .05 for values of x from 0-3.1%, and at p < .10 for values up to 5% (e.g., χ²(1) = 3.71, p = .054 at x = 5%). Diminishing sensitivity in losses is significant for both the pre-marathon goal model (χ²(1) = 31.37, p < .001) and the post-marathon goal model (χ²(1) = 24.13, p < .001). Diminishing sensitivity in gains, however, is not significant for either pre-marathon goals (χ²(1) = 0.08, p = 0.778) or post-marathon goals (χ²(1) = 2.54, p = 0.113). These weakened results can likely be attributed to the reduction in sample size relative to the main analysis (n = 835 vs. n = 1,534).

| Term | Predictor | Pre-Marathon Goal | Post-Marathon Goal |
|---|---|---|---|
| Linear | losses (n̂_1) | 13.93*** | 15.89*** |
| Linear | gains (p̂_1) | 8.69 | 23.68* |
| Quadratic | losses (n̂_2) | 12.62*** | 16.78*** |
| Quadratic | gains (p̂_2) | 18.26 | 205.06 |
| | jump (Ĵ) | 1.13*** | 1.08*** |
| | Observations | 835 | 835 |
| | Degrees of freedom | 20 | 20 |
| | -2 log likelihood | 2617.68 | 2570.20 |

Table A.21: Ordered logit regression on piecewise polynomials relating satisfaction to normalized relative performance, fit to time goals elicited pre- and post-marathon. The statistical significance of coefficients is indicated by: *** (p < .01), ** (p < .05), and * (p < .10).
| x | Pre: loss slope | Pre: gain slope | Pre: p-value (slopes) | Pre: loss level | Pre: gain level | Pre: p-value (levels) | Post: loss slope | Post: gain slope | Post: p-value (slopes) | Post: loss level | Post: gain level | Post: p-value (levels) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.0% | 13.93 | 8.69 | 0.610 | 1.13 | 0.00 | 0.000 | 15.89 | 23.68 | 0.557 | 1.08 | 0.00 | 0.000 |
| 0.5% | 13.81 | 8.51 | 0.586 | 1.20 | 0.04 | 0.000 | 15.73 | 21.63 | 0.625 | 1.16 | 0.11 | 0.001 |
| 1.0% | 13.68 | 8.32 | 0.559 | 1.26 | 0.09 | 0.001 | 15.56 | 19.58 | 0.714 | 1.24 | 0.22 | 0.005 |
| 1.5% | 13.55 | 8.14 | 0.531 | 1.33 | 0.13 | 0.002 | 15.39 | 17.53 | 0.828 | 1.31 | 0.31 | 0.013 |
| 2.0% | 13.43 | 7.96 | 0.501 | 1.40 | 0.17 | 0.004 | 15.22 | 15.48 | 0.977 | 1.39 | 0.39 | 0.025 |
| 2.5% | 13.30 | 7.77 | 0.468 | 1.47 | 0.21 | 0.006 | 15.05 | 13.43 | 0.836 | 1.47 | 0.46 | 0.037 |
| 3.0% | 13.17 | 7.59 | 0.435 | 1.53 | 0.24 | 0.009 | 14.89 | 11.38 | 0.617 | 1.54 | 0.53 | 0.048 |
| 3.5% | 13.05 | 7.41 | 0.400 | 1.60 | 0.28 | 0.012 | 14.72 | 9.33 | 0.393 | 1.62 | 0.58 | 0.056 |
| 4.0% | 12.92 | 7.23 | 0.365 | 1.66 | 0.32 | 0.015 | 14.55 | 7.28 | 0.210 | 1.69 | 0.62 | 0.059 |
| 4.5% | 12.80 | 7.04 | 0.331 | 1.73 | 0.35 | 0.018 | 14.38 | 5.23 | 0.100 | 1.76 | 0.65 | 0.058 |
| 5.0% | 12.67 | 6.86 | 0.299 | 1.79 | 0.39 | 0.021 | 14.22 | 3.18 | 0.050 | 1.83 | 0.67 | 0.054 |

Table A.22: Tests of loss aversion for participants in the goal-asked condition using pre-marathon and post-marathon goals. Loss and gain slopes (S'(-x) and S'(x)) and loss and gain levels (S(0) - S(-x) and S(x) - S(0)) at different values of x for the quadratic model. The p-values are obtained by Wald tests that examine the null hypothesis of equality of the loss and gain slopes and the loss and gain levels.

A.3.7 Effect of Prior Goal Elicitation

In Section A.3.6, we compared the pre- and post-marathon goals reported by participants in the goal-asked conditions. Here we compare those in the goal-asked and goal-not-asked conditions. We hypothesized that asking participants to report a goal ahead of the marathon may increase commitment to that goal, and in turn increase reference dependence. To test this conjecture, we refit the quadratic model using post-marathon goals (as in the main analysis), but with the addition of a dummy variable indicating whether the participant had also been asked to provide a goal prior to the marathon and the interactions of that dummy variable with the other terms in the model. We find no significant effect of having elicited a goal prior to the marathon on the difference between gain and loss levels from x = 0-5% (e.g., χ²(1) = 0.52, p = .471, at x = 5%). There is also no significant effect on diminishing sensitivity in either gains (χ²(1) = 0.82, p = .365) or losses (χ²(1) = 0.14, p = .709).

A.3.8 Timing of Satisfaction Elicitation

Recall also that we varied the timing of the elicitation of post-marathon satisfaction. Some participants (66.1% of our sample) received the post-marathon survey 1 day after the marathon and others (33.9% of our sample) received the survey 2 weeks after the marathon. One possibility is that reference dependence is more pronounced immediately after the marathon, when comparisons to one's time goal are most salient. To test for this possibility, we refit the quadratic model, adding a dummy variable to indicate whether the participant's goal had been elicited late (2 weeks rather than 1 day post-marathon), and the interactions of the dummy variable with the other terms in the model. There is no significant effect of the timing of satisfaction elicitation on the difference between gain and loss levels from x = 0-5% (e.g., χ²(1) = 1.09, p = .297, at x = 5%).
There is a significant, positive effect of eliciting satisfaction later on diminishing sensitivity in gains (χ²(1) = 8.54, p = .003), but not in losses (χ²(1) = 0.00, p = .983).

A.3.9 Moderation by Goal Importance and Runner Experience

We have shown that satisfaction exhibits loss aversion, both in the form of a jump at the reference point and a steeper slope in losses than in gains. In this section, we investigate how experience and the importance of the time goal moderate that relationship. Our experience measure is whether a runner had previously completed a marathon. There is a small but significant positive correlation between rating of goal importance and marathon experience (Spearman ρ = 0.05, p = .034), and thus we test both moderations within the same model.

We perform a hierarchical regression analysis to understand how these factors moderate satisfaction. First, we adapt the piecewise quadratic polynomial model, adding goal importance as a continuous variable and marathon experience as a dummy variable. Second, we include the interactions of goal importance with the linear and quadratic loss terms as well as with the jump. We also enter interactions of marathon experience with the same terms. We omit interactions of goal importance or experience with the gain terms because of the relatively sparse data for gains.⁷ Finally, we enter all three-way interaction terms. The coefficients on the fitted models and likelihood-ratio tests comparing these models are presented in Table A.24. The full model that includes all three-way interactions provides a significantly better fit than either of the simpler models. Figure A.6 plots the full model's effect on the jump at the reference point, the loss slope at x = -5%, and the loss level at x = -5% as a function of both goal importance and marathon experience.
The plots reveal what appears to be: (i) no systematic effect of goal importance or experience on jump; (ii) a main effect of goal importance on loss slope at x = -5%, with no clear effect of experience; and (iii) a less pronounced main effect of goal importance on loss level at x = -5%, with no clear effect of experience. Statistical analyses provide support for these observations. Table A.24 shows that there are significant positive effects of goal importance on both the linear and quadratic terms in the loss domain, but an insignificant interaction with the jump at the reference point.

⁷ In addition, relative performance is positively correlated with reported goal importance (Spearman ρ = 0.10, p < .001). 51.7% of the participants who exceeded their goal reported a goal importance of 7, while only 7.1% reported a goal importance of 4 or less, making it difficult to detect an effect of the moderating variables on the shape of the function above the reference point.

Recall that the slope in the gain domain is