Model Selection Erwan Le Pennec Fall 2015


library("dplyr")
library("ggplot2")
library("ggfortify")
library("reshape2")

Model Selection

We will now use another classical dataset, birthwt, which corresponds to a study on risk factors associated with low infant birth weight conducted at Baystate Medical Center, Springfield, Mass., during 1986. It consists of 189 observations of 10 variables.

Variable  Content
low       indicator of birth weight less than 2.5 kg.
age       mother's age in years.
lwt       mother's weight in pounds at last menstrual period.
race      mother's race (1 = white, 2 = black, 3 = other).
smoke     smoking status during pregnancy.
ptl       number of previous premature labors.
ht        history of hypertension.
ui        presence of uterine irritability.
ftv       number of physician visits during the first trimester.
bwt       birth weight in grams.

Our goal will be to predict bwt, the birth weight, from all the other variables (except low!).

1. Load the dataset from the package MASS and inspect it with glimpse.

lbw <- MASS::birthwt
glimpse(lbw)

Observations: 189
Variables: 10
$ low   (int) 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
$ age   (int) 19, 33, 20, 21, 18, 21, 22, 17, 29, 26, 19, 19, 22, 30,...
$ lwt   (int) 182, 155, 105, 108, 107, 124, 118, 103, 123, 113, 95, 15...
$ race  (int) 2, 3, 1, 1, 1, 3, 1, 3, 1, 1, 3, 3, 3, 3, 1, 1, 2, 1, 3,...
$ smoke (int) 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0,...
$ ptl   (int) 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,...
$ ht    (int) 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,...
$ ui    (int) 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1,...
$ ftv   (int) 0, 3, 1, 2, 0, 0, 1, 1, 1, 0, 0, 1, 0, 2, 0, 0, 0, 3, 0,...
$ bwt   (int) 2523, 2551, 2557, 2594, 2600, 2622, 2637, 2637, 2663, 26...

2. Fix the different factor issues.

lbw <- mutate(lbw, low = factor(low, levels = c(0,1), labels = c("normal", "low")))
lbw <- mutate(lbw, race = factor(race, levels = c(1,2,3), labels = c("white", "black", "other")))
lbw <- mutate(lbw, smoke = factor(smoke, levels = c(0,1), labels = c("no","yes")))
lbw <- mutate(lbw, ht = factor(ht, levels = c(0,1), labels = c("no","yes")))
lbw <- mutate(lbw, ui = factor(ui, levels = c(0,1), labels = c("no","yes")))
lbw <- select(lbw, -low)
glimpse(lbw)

Observations: 189
Variables: 9
$ age   (int)  19, 33, 20, 21, 18, 21, 22, 17, 29, 26, 19, 19, 22, 30,...
$ lwt   (int)  182, 155, 105, 108, 107, 124, 118, 103, 123, 113, 95, 15...
$ race  (fctr) black, other, white, white, white, other, white, other,...
$ smoke (fctr) no, no, yes, yes, yes, no, no, no, yes, yes, no, no, no...
$ ptl   (int)  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,...
$ ht    (fctr) no, no, no, no, no, no, no, no, no, no, no, no, yes, no...
$ ui    (fctr) yes, no, no, yes, yes, no, no, no, no, no, no, no, no,...
$ ftv   (int)  0, 3, 1, 2, 0, 0, 1, 1, 1, 0, 0, 1, 0, 2, 0, 0, 0, 3, 0,...
$ bwt   (int)  2523, 2551, 2557, 2594, 2600, 2622, 2637, 2637, 2663, 26...

3. Verify that the dataset does not contain any missing values.

summary(lbw)

      age             lwt          race      smoke         ptl
 Min.   :14.00   Min.   : 80.0   white:96   no :115   Min.   :0.0000
 1st Qu.:19.00   1st Qu.:110.0   black:26   yes: 74   1st Qu.:0.0000
 Median :23.00   Median :121.0   other:67             Median :0.0000
 Mean   :23.24   Mean   :129.8                        Mean   :0.1958
 3rd Qu.:26.00   3rd Qu.:140.0                        3rd Qu.:0.0000
 Max.   :45.00   Max.   :250.0                        Max.   :3.0000

  ht        ui           ftv              bwt
 no :177   no :161   Min.   :0.0000   Min.   : 709
 yes: 12   yes: 28   1st Qu.:0.0000   1st Qu.:2414
                     Median :0.0000   Median :2977
                     Mean   :0.7937   Mean   :2945
                     3rd Qu.:1.0000   3rd Qu.:3487
                     Max.   :6.0000   Max.   :4990

4. Inspect visually all the variables independently.

for (name in names(lbw)) {
  print(qplot(data = lbw, get(name), xlab = name))
}

[Figure: histograms of age and lwt]

[Figure: bar charts of race and smoke]

[Figure: bar charts of ptl and ht]

[Figure: bar charts of ui and ftv]

[Figure: histogram of bwt]

5. Inspect visually the relation between every variable and bwt. Can you infer the most useful variables?

for (name in names(lbw)[-9]) {
  if (class(lbw[[name]]) == "factor") {
    print(ggplot(data = lbw, aes_string(x = name, y = "bwt")) +
            geom_boxplot() +
            geom_point(position = position_jitter(width = .1)))
  } else {
    print(ggplot(data = lbw, aes_string(x = name, y = "bwt")) +
            geom_point(position = position_jitter(width = .1)) +
            geom_smooth())
  }
}

[Figure: bwt versus age and bwt versus lwt]

[Figure: bwt by race and bwt by smoke]

[Figure: bwt versus ptl and bwt by ht]

[Figure: bwt by ui and bwt versus ftv]

6. Compute the full regression with all the variables and compute its summary (and maybe its diagnostic plots).

reglbw <- lm(bwt ~ ., data = lbw)
summary(reglbw)

Call:
lm(formula = bwt ~ ., data = lbw)

Residuals:
     Min       1Q   Median       3Q      Max
-1825.26  -435.21    55.91   473.46  1701.20

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 2927.962    312.904    9.357  < 2e-16 ***
age           -3.570      9.620   -0.371 0.711012
lwt            4.354      1.736    2.509 0.013007 *
raceblack   -488.428    149.985   -3.257 0.001349 **
raceother   -355.077    114.753   -3.094 0.002290 **
smokeyes    -352.045    106.476   -3.306 0.001142 **
ptl          -48.402    101.972   -0.475 0.635607
htyes       -592.827    202.321   -2.930 0.003830 **
uiyes       -516.081    138.885   -3.716 0.000271 ***
ftv          -14.058     46.468   -0.303 0.762598
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 650.3 on 179 degrees of freedom
Multiple R-squared: 0.2427, Adjusted R-squared: 0.2047
F-statistic: 6.376 on 9 and 179 DF, p-value: 7.891e-08

autoplot(reglbw)

[Figure: diagnostic plots of reglbw (Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage)]

7. Compute the trivial regression, with no variables but the intercept, as a reference of a _bad_ method.

reglbwtriv <- lm(bwt ~ 1, data = lbw)
summary(reglbwtriv)

Call:
lm(formula = bwt ~ 1, data = lbw)

Residuals:
     Min       1Q   Median       3Q      Max
-2235.59  -530.59    32.41   542.41  2045.41

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2944.59      53.04   55.51   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 729.2 on 188 degrees of freedom

7. Create a function that, given a lm model, computes the empirical error, the debiased error, the cross validation error, the deviance (-2 log-likelihood), the AIC criterion and the BIC criterion.

V <- 5
LbwFolds <- caret::createMultiFolds(lbw[["bwt"]], k = V, times = T)
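createMultiFolds builds, for each of T repetitions, V lists of training row indices (the complement of each fold). The same bookkeeping can be sketched outside R; the Python below uses a hypothetical create_multi_folds helper and is not caret's implementation, just the idea:

```python
import random

def create_multi_folds(n, k=5, times=2, seed=0):
    """For each of `times` repetitions, shuffle the row indices,
    cut them into k roughly equal folds, and return the complement
    of each fold as a training-index list (times * k lists in all)."""
    rng = random.Random(seed)
    train_sets = []
    for _ in range(times):
        idx = list(range(n))
        rng.shuffle(idx)
        for v in range(k):
            held_out = set(idx[v::k])  # every k-th shuffled index
            train_sets.append([i for i in range(n) if i not in held_out])
    return train_sets

folds = create_multi_folds(189, k=5, times=2)
```

Each of the 10 training sets leaves out roughly 189/5 = 38 observations, and within one repetition the held-out sets partition the 189 rows, so every observation is predicted exactly once per repetition.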

computeerrlm <- function(model, name) {
  err <- mean((lbw[["bwt"]] - predict(model))^2)
  errcp <- err * (1 + 2 * length(model[["coefficients"]]) / nrow(lbw))
  errcvtmp <- matrix(0, nrow = 1, ncol = (T * V))
  for (v in 1:(T * V)) {
    lbwtrain <- slice(lbw, LbwFolds[[v]])
    lbwtest <- slice(lbw, -LbwFolds[[v]])
    regtmp <- lm(model, data = lbwtrain)
    predtmp <- predict(regtmp, newdata = lbwtest)
    errcvtmp[v] <- mean((lbwtest[["bwt"]] - predtmp)^2)
  }
  errcv <- mean(errcvtmp)
  errcvup <- errcv + 2 * sd(errcvtmp) / sqrt(T * V)
  LogLik <- -2 * logLik(model)
  LogLikAIC <- AIC(model)
  LogLikBIC <- BIC(model)
  data.frame(
    method = name, err = err, errcp = errcp,
    errcv = errcv, errcvup = errcvup,
    LogLik = LogLik, LogLikAIC = LogLikAIC, LogLikBIC = LogLikBIC)
}

8. Compute the errors of the trivial and the full model.

errs <- computeerrlm(reglbwtriv, "Trivial")
errs <- rbind(errs, computeerrlm(reglbw, "Full"))
errs

   method      err    errcp    errcv  errcvup   LogLik LogLikAIC LogLikBIC
1 Trivial 528940.0 534537.2 529334.1 616085.1 3027.120  3031.120  3037.603
2    Full 400541.4 442926.7 457825.2 490557.5 2974.567  2996.567  3032.226

9. Create a function that takes a data frame of errors for possibly several models and plots them. Test it on the full model.

Plot_Err <- function(errs) {
  ggplot(data = melt(select(errs, -matches("LogLik"))),
         aes(x = method, y = value, color = variable)) +
    geom_point(size = 5) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
}
Plot_Err(errs)
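For a Gaussian linear model, the likelihood-based criteria in the table have closed forms in the residual sum of squares: with err = RSS/n, the deviance is n(log 2π + log err + 1), and AIC and BIC add 2k and k log n respectively, where k counts the regression coefficients plus the estimated noise variance. The sketch below (Python; helper names are mine) reproduces the Full row of the table:

```python
import math

def gaussian_criteria(err, n, n_coef):
    """Deviance (-2 log-likelihood), AIC and BIC of a Gaussian linear
    model from err = RSS / n; the parameter count is n_coef + 1
    because the noise variance is estimated as well."""
    deviance = n * (math.log(2 * math.pi) + math.log(err) + 1)
    k = n_coef + 1
    return {"deviance": deviance,
            "AIC": deviance + 2 * k,
            "BIC": deviance + math.log(n) * k}

def err_cp(err, n, n_coef):
    """Debiased empirical error: the training error inflated by the
    optimism term 2 * n_coef / n (a Mallows-Cp-style correction)."""
    return err * (1 + 2 * n_coef / n)

# Full model of the table: n = 189 observations, 10 coefficients
crit = gaussian_criteria(400541.359746837, 189, 10)
```

Plugging in the Full row (err = 400541.4, 10 coefficients, n = 189) recovers the deviance 2974.567, AIC 2996.567, BIC 3032.226 and errcp 442926.7 reported above.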

[Figure: err, errcp, errcv and errcvup for the Trivial and Full models]

Plot_LogLik <- function(errs) {
  ggplot(data = melt(select(errs, -matches("err"))),
         aes(x = method, y = value, color = variable)) +
    geom_point(size = 5) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
}
Plot_LogLik(errs)

[Figure: deviance, AIC and BIC for the Trivial and Full models]

10. According to the summary, which variables can be removed from the model? Test this assumption by removing them, computing the errors and plotting them for the two models.

reglbw2 <- update(reglbw, ~ . - age - ptl - ftv)
summary(reglbw2)

Call:
lm(formula = bwt ~ lwt + race + smoke + ht + ui, data = lbw)

Residuals:
     Min       1Q   Median       3Q      Max
-1842.14  -433.19    67.09   459.21  1631.03

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 2837.264    243.676   11.644  < 2e-16 ***
lwt            4.242      1.675    2.532 0.012198 *
raceblack   -475.058    145.603   -3.263 0.001318 **
raceother   -348.150    112.361   -3.099 0.002254 **
smokeyes    -356.321    103.444   -3.445 0.000710 ***
htyes       -585.193    199.644   -2.931 0.003810 **
uiyes       -525.524    134.675   -3.902 0.000134 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 645.9 on 182 degrees of freedom

Multiple R-squared: 0.2404, Adjusted R-squared: 0.2154
F-statistic: 9.6 on 6 and 182 DF, p-value: 3.601e-09

errs <- rbind(errs, computeerrlm(reglbw2, "Simplified"))
Plot_Err(errs)

[Figure: err, errcp, errcv and errcvup for the Trivial, Full and Simplified models]

Plot_LogLik(errs)

[Figure: deviance, AIC and BIC for the Trivial, Full and Simplified models]

Find_Best <- function(errs) {
  nameserr <- names(errs)[-1]
  for (nameerr in nameserr) {
    writeLines(strwrap(paste(nameerr, ": ",
      errs[["method"]][which.min(errs[[nameerr]])],
      "(", min(errs[[nameerr]], na.rm = TRUE), ")")))
  }
}
Find_Best(errs)

err : Full ( 400541.359746837 )
errcp : Simplified ( 431547.646386076 )
errcv : Simplified ( 440959.518704279 )
errcvup : Simplified ( 469381.57651398 )
LogLik : Full ( 2974.56693222411 )
LogLikAIC : Simplified ( 2991.15319687154 )
LogLikBIC : Simplified ( 3017.08717299202 )

11. What would be the next simplification? Is it efficient?

reglbw3 <- update(reglbw2, ~ . - lwt)
summary(reglbw3)

Call:

lm(formula = bwt ~ race + smoke + ht + ui, data = lbw)

Residuals:
     Min       1Q   Median       3Q      Max
-1828.68  -452.50    46.24   447.24  1577.24

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3412.76      89.06  38.321  < 2e-16 ***
raceblack    -425.06     146.37  -2.904 0.004139 **
raceother    -409.26     111.35  -3.676 0.000312 ***
smokeyes     -386.20     104.28  -3.704 0.000281 ***
htyes        -472.33     197.46  -2.392 0.017768 *
uiyes        -563.09     135.82  -4.146 5.17e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 655.4 on 183 degrees of freedom
Multiple R-squared: 0.2136, Adjusted R-squared: 0.1922
F-statistic: 9.944 on 5 and 183 DF, p-value: 1.98e-08

errs <- rbind(errs, computeerrlm(reglbw3, "Simplified2"))
Plot_Err(errs)

[Figure: err, errcp, errcv and errcvup for the Trivial, Full, Simplified and Simplified2 models]

Plot_LogLik(errs)

[Figure: deviance, AIC and BIC for the Trivial, Full, Simplified and Simplified2 models]

Find_Best(errs)

err : Full ( 400541.359746837 )
errcp : Simplified ( 431547.646386076 )
errcv : Simplified ( 440959.518704279 )
errcvup : Simplified ( 469381.57651398 )
LogLik : Full ( 2974.56693222411 )
LogLikAIC : Simplified ( 2991.15319687154 )
LogLikBIC : Simplified ( 3017.08717299202 )

12. Use glmulti from the package of the same name to test all the possible variable subsets without any interaction. (Use the level = 1 option!) What is the best model according to the AIC criterion?

library(glmulti)
bests <- glmulti(bwt ~ ., data = lbw, level = 1, family = "gaussian", plotty = FALSE)

Initialization...
TASK: Exhaustive screening of candidate set.
Fitting...

After 50 models:
Best model: bwt~1+race+smoke
Crit= 3012.22332294019
Mean crit= 3025.5948182594

After 100 models:

Best model: bwt~1+race+smoke+lwt
Crit= 3008.82234095073
Mean crit= 3022.61344196622

After 150 models:
Best model: bwt~1+race+smoke+ht+lwt
Crit= 3004.33890778711
Mean crit= 3017.10903905785

After 200 models:
Best model: bwt~1+race+smoke+ui+lwt
Crit= 2997.87126280367
Mean crit= 3011.12596797514

After 250 models:
Best model: bwt~1+race+smoke+ht+ui
Crit= 2995.69454012035
Mean crit= 3007.6193065738

Completed.

13. bests@formulas contains a list of the best models. Use this to compute all the errors for the 50 best models. Compare those errors with those of our naive attempts.

errmulti <- data.frame()
for (f in 1:50) {
  model <- lm(bests@formulas[[f]], data = lbw)
  errmulti <- rbind(errmulti, computeerrlm(model, sprintf("Best_%g", f)))
}
errs_multi <- rbind(errs, errmulti)
Plot_Err(errs_multi)

[Figure: err, errcp, errcv and errcvup for the Trivial, Full, Simplified, Simplified2 and Best_1 to Best_50 models]

Plot_LogLik(errs_multi)

[Figure: deviance, AIC and BIC for the same models]
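The exhaustive screening of question 12 is cheap because each of the 8 candidate main effects is independently in or out of the model, so there are only 2^8 = 256 subsets to fit (hence glmulti finishing shortly after "After 250 models"). Every extra candidate term doubles the count, which is why the level-2 candidate set counted further below rules out an exhaustive search. A quick sanity check (Python):

```python
# An exhaustive subset search over p candidate terms fits 2**p models:
# each term is independently included or excluded.
def n_subset_models(p):
    return 2 ** p

print(n_subset_models(8))  # the 8 predictors of birthwt -> 256 models
# With level-2 interactions the candidate term count grows
# combinatorially, and glmulti reports 604023872 admissible models.
```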

Find_Best(errmulti)

err : Best_9 ( 400541.359746837 )
errcp : Best_1 ( 431547.646386076 )
errcv : Best_1 ( 440959.518704279 )
errcvup : Best_4 ( 468678.781237941 )
LogLik : Best_9 ( 2974.56693222411 )
LogLikAIC : Best_1 ( 2991.15319687154 )
LogLikBIC : Best_1 ( 3017.08717299202 )

Find_Best(errs_multi)

err : Full ( 400541.359746837 )
errcp : Simplified ( 431547.646386076 )
errcv : Simplified ( 440959.518704279 )
errcvup : Best_4 ( 468678.781237941 )
LogLik : Full ( 2974.56693222411 )
LogLikAIC : Simplified ( 2991.15319687154 )
LogLikBIC : Simplified ( 3017.08717299202 )

14. Add the interactions of level 2 and use glmulti with method = "d" to find the number of models. Is the exhaustive search possible?

glmulti(bwt ~ ., data = lbw, level = 2, family = "gaussian", method = "d", plotty = FALSE)

Initialization...
TASK: Diagnostic of candidate set.
Sample size: 189
4 factor(s).
4 covariate(s).
0 f exclusion(s).
0 c exclusion(s).
0 f:f exclusion(s).
0 c:c exclusion(s).
0 f:c exclusion(s).
Size constraints: min = 0 max = -1
Complexity constraints: min = 0 max = -1
Your candidate set contains 604023872 models.

[1] 604023872

15. Use the genetic algorithm of glmulti (method = "g") to explore those models and examine the best 25 solutions.

bestsgen <- glmulti(bwt ~ ., data = lbw, level = 2, family = "gaussian", method = "g", plotty = FALSE)

Initialization...
TASK: Genetic algorithm in the candidate set.
Initialization...
Algorithm started...

After 10 generations: Best model: bwt~1+race+smoke+ht+ui+age+lwt+ptl+lwt:age+ptl:age+ftv:ptl+smoke:age+smoke:ptl+ht:age+ht: Crit= 2992.85809888721 Mean crit= 3009.61656363447 Change in best IC: -7007.14190111279 / Change in mean IC: -6990.38343636554 After 20 generations: Best model: bwt~1+race+smoke+ui+age+lwt+ptl+ftv+lwt:age+ptl:age+ptl:lwt+ftv:age+smoke:age+smoke:ptl+h Crit= 2988.96959596122 Mean crit= 3005.6665829892 Change in best IC: -3.88850292598408 / Change in mean IC: -3.94998064526408 After 30 generations: Best model: bwt~1+race+smoke+age+lwt+ptl+ftv+ptl:age+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+u Crit= 2985.63181175359 Mean crit= 3003.28156987427 Change in best IC: -3.33778420763156 / Change in mean IC: -2.38501311492746 After 40 generations: Best model: bwt~1+race+smoke+age+lwt+ptl+ftv+ptl:age+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+u Crit= 2985.63181175359 Mean crit= 3002.565158201 Change in best IC: 0 / Change in mean IC: -0.716411673271068 After 50 generations: Best model: bwt~1+race+smoke+age+lwt+ptl+ftv+ptl:age+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+u Crit= 2985.63181175359 Mean crit= 3000.9040135337 Change in best IC: 0 / Change in mean IC: -1.66114466730187 After 60 generations: Best model: bwt~1+race+smoke+age+lwt+ptl+ftv+ptl:age+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+u Crit= 2985.63181175359 Mean crit= 3000.24302059568 Change in best IC: 0 / Change in mean IC: -0.660992938021536 After 70 generations: Best model: bwt~1+race+smoke+age+lwt+ptl+ftv+ptl:age+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+u Crit= 2985.63181175359 Mean crit= 3000.00269586114 Change in best IC: 0 / Change in mean IC: -0.2403247345419 After 80 generations: Best model: bwt~1+race+smoke+age+lwt+ptl+ftv+ptl:age+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+u Crit= 2985.63181175359 Mean crit= 2999.55944024434 Change in best IC: 0 / Change in mean IC: -0.443255616795341 After 90 generations: Best model: 
bwt~1+race+smoke+age+lwt+ptl+ftv+ptl:age+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+u Crit= 2985.63181175359 Mean crit= 2999.23923957031 Change in best IC: 0 / Change in mean IC: -0.320200674027546 24

After 100 generations: Best model: bwt~1+race+smoke+age+lwt+ptl+ftv+ptl:age+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+u Crit= 2985.63181175359 Mean crit= 2998.87696117126 Change in best IC: 0 / Change in mean IC: -0.362278399058141 After 110 generations: Best model: bwt~1+race+smoke+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2985.47181090411 Mean crit= 2998.47527431491 Change in best IC: -0.16000084947791 / Change in mean IC: -0.401686856343076 After 120 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2997.87105632815 Change in best IC: -0.772330790574415 / Change in mean IC: -0.604217986768163 After 130 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2997.30481974195 Change in best IC: 0 / Change in mean IC: -0.566236586199466 After 140 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2996.88033490373 Change in best IC: 0 / Change in mean IC: -0.424484838210901 After 150 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2996.6677981896 Change in best IC: 0 / Change in mean IC: -0.212536714130692 After 160 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2996.26211688134 Change in best IC: 0 / Change in mean IC: -0.405681308261137 After 170 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2996.21431736516 Change in best IC: 0 / Change in mean IC: -0.0477995161845683 After 180 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 
2984.69948011354 Mean crit= 2995.91535426962 Change in best IC: 0 / Change in mean IC: -0.298963095537147 25

After 190 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2995.76431946929 Change in best IC: 0 / Change in mean IC: -0.151034800329398 After 200 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2995.34231398779 Change in best IC: 0 / Change in mean IC: -0.422005481504584 After 210 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2994.93477661566 Change in best IC: 0 / Change in mean IC: -0.407537372126171 After 220 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2994.78115536105 Change in best IC: 0 / Change in mean IC: -0.15362125460706 After 230 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2994.57508582484 Change in best IC: 0 / Change in mean IC: -0.206069536214272 After 240 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2994.53417864596 Change in best IC: 0 / Change in mean IC: -0.0409071788781148 After 250 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2994.5030988968 Change in best IC: 0 / Change in mean IC: -0.0310797491633821 After 260 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2994.49010106458 Change in best IC: 0 / Change in mean IC: -0.0129978322183888 After 270 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2994.36999030454 Change 
in best IC: 0 / Change in mean IC: -0.120110760038642 26

After 280 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2994.15050207592 Change in best IC: 0 / Change in mean IC: -0.219488228620321 After 290 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2994.06916560256 Change in best IC: 0 / Change in mean IC: -0.0813364733562594 After 300 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2993.89587074507 Change in best IC: 0 / Change in mean IC: -0.173294857490873 After 310 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2993.80985526538 Change in best IC: 0 / Change in mean IC: -0.0860154796928327 After 320 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2993.78763355017 Change in best IC: 0 / Change in mean IC: -0.0222217152145276 After 330 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2993.56593141684 Change in best IC: 0 / Change in mean IC: -0.221702133324015 After 340 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2993.28730982022 Change in best IC: 0 / Change in mean IC: -0.278621596621178 After 350 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2993.27576897212 Change in best IC: 0 / Change in mean IC: -0.0115408480983206 After 360 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2993.25941438693 
Change in best IC: 0 / Change in mean IC: -0.016354585190129 27

After 370 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2993.25941438693 Change in best IC: 0 / Change in mean IC: 0
After 380 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2993.21047034041 Change in best IC: 0 / Change in mean IC: -0.0489440465207736
After 390 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.96379965618 Change in best IC: 0 / Change in mean IC: -0.246670684230594
After 400 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.83238201074 Change in best IC: 0 / Change in mean IC: -0.131417645437068
After 410 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.71645905455 Change in best IC: 0 / Change in mean IC: -0.115922956193117
After 420 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.71261931747 Change in best IC: 0 / Change in mean IC: -0.00383973708221674
After 430 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.41611622275 Change in best IC: 0 / Change in mean IC: -0.296503094715263
After 440 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.27169857193 Change in best IC: 0 / Change in mean IC: -0.144417650827108
After 450 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.23625413153 Change in best IC: 0 / Change in mean IC: -0.0354444404010792

After 460 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.21427443729 Change in best IC: 0 / Change in mean IC: -0.0219796942342327
After 470 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.21427443729 Change in best IC: 0 / Change in mean IC: 0
After 480 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.21323694936 Change in best IC: 0 / Change in mean IC: -0.00103748793526393
After 490 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.10850336499 Change in best IC: 0 / Change in mean IC: -0.104733584370933
After 500 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.10850336499 Change in best IC: 0 / Change in mean IC: 0
After 510 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.07867681513 Change in best IC: 0 / Change in mean IC: -0.0298265498504406
After 520 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.02636056815 Change in best IC: 0 / Change in mean IC: -0.0523162469894487
After 530 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.00261797083 Change in best IC: 0 / Change in mean IC: -0.0237425973145946
After 540 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2992.00261797083 Change in best IC: 0 / Change in mean IC: 0

After 550 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2991.97858793428 Change in best IC: 0 / Change in mean IC: -0.0240300365517214
After 560 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2991.96560573739 Change in best IC: 0 / Change in mean IC: -0.0129821968853321
After 570 generations: Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl Crit= 2984.69948011354 Mean crit= 2991.96560573739
Improvements in best and average IC have been below the specified goals. Algorithm is declared to have converged. Completed.

errgen <- data.frame()
for (f in 1:25) {
  model <- lm(bestsgen@formulas[[f]], data = lbw)
  errgen <- rbind(errgen, computeerrlm(model, sprintf("BestGen_%g", f)))
}
errs_gen <- rbind(errs_multi, errgen)
Plot_Err(errs_gen)
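For context, the trace above is the progress report of glmulti's genetic search. The call that launched it is not part of this excerpt; a genetic run of this shape would look roughly as follows, where the specific option values (level = 2 for pairwise interactions, AIC as criterion, 25 retained formulas) are assumptions chosen to match the output above:

```r
library("glmulti")

# Genetic search (method = "g") over main effects and pairwise
# interactions (level = 2), scored by AIC, keeping the 25 best
# formulas -- settings assumed from the trace and the loop over 1:25.
bestsgen <- glmulti(bwt ~ race + age + lwt + smoke + ptl + ht + ui + ftv,
                    data = lbw, level = 2, method = "g",
                    crit = "aic", confsetsize = 25)

# The retained formulas are then available as bestsgen@formulas
```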

[Figure: Plot_Err output -- estimated errors (err, errcp, errcv, errcvup) for the Trivial, Full, Simplified, Simplified2, Best_1-Best_50 and BestGen_1-BestGen_25 models]

Plot_LogLik(errs_gen)

[Figure: Plot_LogLik output -- log-likelihood criteria (LogLik, LogLikAIC, LogLikBIC) for the same models]

Find_Best(errgen)

err : BestGen_19 ( 349136.928352408 )
errcp : BestGen_1 ( 413992.797239044 )
errcv : BestGen_2 ( 413612.04629105 )
errcvup : BestGen_2 ( 441873.398664721 )
LogLik : BestGen_19 ( 2948.60724520876 )
LogLikAIC : BestGen_1 ( 2984.69948011354 )
LogLikBIC : BestGen_2 ( 3030.20793238515 )

Find_Best(errs_gen)

err : BestGen_19 ( 349136.928352408 )
errcp : BestGen_1 ( 413992.797239044 )
errcv : BestGen_2 ( 413612.04629105 )
errcvup : BestGen_2 ( 441873.398664721 )
LogLik : BestGen_19 ( 2948.60724520876 )
LogLikAIC : BestGen_1 ( 2984.69948011354 )
LogLikBIC : Simplified ( 3017.08717299202 )

16. Use glmnet to try a regularization method to obtain a best model.

X <- model.matrix(bwt ~ .^2 - 1, data = lbw)
Y <- lbw[["bwt"]]
library("glmnet")
lbw_lasso <- glmnet(X, Y, family = "gaussian")
coeffs_lbw_lasso <- cbind(data.frame(t(as.matrix(coef(lbw_lasso)))),
                          lambda = lbw_lasso[["lambda"]])
ggplot(data = melt(coeffs_lbw_lasso, "lambda"),
       aes(x = lambda, y = value, color = variable)) + geom_line()
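Note that glmnet also ships with cv.glmnet, which cross-validates the entire lambda path in one call rather than looping over the folds by hand. A minimal sketch on the same X and Y (the nfolds value is glmnet's default, stated here for clarity):

```r
library("glmnet")

# 10-fold cross-validation over the whole lambda path in one call
lbw_cv <- cv.glmnet(X, Y, family = "gaussian", nfolds = 10)

lbw_cv[["lambda.min"]]  # lambda minimizing the CV estimate of the error
lbw_cv[["lambda.1se"]]  # largest lambda within one standard error of that minimum

plot(lbw_cv)                    # CV curve with error bars along the path
coef(lbw_cv, s = "lambda.min")  # coefficients at the selected lambda
```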

[Figure: lasso coefficient paths -- coefficient value against lambda, one curve per variable of the interaction design (age.smokeyes, age.ptl, age.htyes, age.uiyes, age.ftv, lwt.raceblack, ..., raceblack.uiyes)]

computeerrglmnet <- function(model, lambda, name) {
  err <- mean((Y - predict(model, X, lambda))^2)
  errcp <- err * (1 + 2 * (sum(abs(coef(model, lambda)) > 0)) / nrow(lbw))
  errcvtmp <- matrix(0, nrow = 1, ncol = (T*V))
  for (v in 1:(T*V)) {
    Xtrain <- X[LbwFolds[[v]],]
    Xtest <- X[-LbwFolds[[v]],]
    Ytrain <- Y[LbwFolds[[v]]]
    Ytest <- Y[-LbwFolds[[v]]]
    regtmp <- glmnet(Xtrain, Ytrain, family = "gaussian", lambda = lambda)
    predtmp <- predict(regtmp, Xtest, lambda)
    errcvtmp[v] <- mean((Ytest - predtmp)^2)
  }
  errcv <- mean(errcvtmp)
  errcvup <- errcv + 2 * sd(errcvtmp) / sqrt(T*V)
  data.frame(method = name, err = err, errcp = errcp,
             errcv = errcv, errcvup = errcvup,
             LogLik = NA, LogLikAIC = NA, LogLikBIC = NA)
}

computeerrlm2 <- function(model, name) {
  err <- mean((lbwint[["bwt"]] - predict(model))^2)
  errcp <- err * (1 + 2 * length(model[["coefficients"]]) / nrow(lbw))
  errcvtmp <- matrix(0, nrow = 1, ncol = (T*V))
  for (v in 1:(T*V)) {
    lbwtrain <- slice(lbwint, LbwFolds[[v]])
    lbwtest <- slice(lbwint, -LbwFolds[[v]])
    regtmp <- lm(formula(model), data = lbwtrain)
    predtmp <- predict(regtmp, newdata = lbwtest)
    errcvtmp[v] <- mean((lbwtest[["bwt"]] - predtmp)^2)
  }
  errcv <- mean(errcvtmp)
  errcvup <- errcv + 2 * sd(errcvtmp) / sqrt(T*V)
  LogLik <- -2 * logLik(model)
  LogLikAIC <- AIC(model)
  LogLikBIC <- BIC(model)
  data.frame(method = name, err = err, errcp = errcp,
             errcv = errcv, errcvup = errcvup,
             LogLik = LogLik, LogLikAIC = LogLikAIC, LogLikBIC = LogLikBIC)
}

errlambda <- data.frame()
errlambdasup <- data.frame()
dx <- data.frame(X)
lbwint <- cbind(dx, bwt = Y)
for (l in 1:length(lbw_lasso[["lambda"]])) {
  lambda <- lbw_lasso[["lambda"]][l]
  errlambda <- rbind(errlambda,
                     computeerrglmnet(lbw_lasso, lambda, sprintf("Lasso_%g", l)))
  subsetlambda <- which(abs(coef(lbw_lasso, lambda)[-1]) > 0)
  if (length(subsetlambda) > 0) {
    reglambda <- lm(bwt ~ ., data = mutate(select(dx, subsetlambda), bwt = Y))
    errlambdasup <- rbind(errlambdasup,
                          computeerrlm2(reglambda, sprintf("LassoSup_%g", l)))
  }
}
errs_lasso <- rbind(errs_gen, errlambda, errlambdasup)
Plot_Err(errs_lasso)
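The LassoSup models built above follow a lasso-then-OLS scheme: the lasso picks the active set at each lambda, and an unpenalized linear model is then refitted on those variables. Versions of glmnet from 3.0 onward (released well after this document) offer this refit directly through the relax argument; a sketch, assuming such a version is installed:

```r
library("glmnet")

# relax = TRUE additionally fits, at each lambda, the unpenalized
# least-squares model on the variables active at that lambda
lbw_relax <- glmnet(X, Y, family = "gaussian", relax = TRUE)

# Cross-validate the relaxed path; gamma = 0 keeps only the pure OLS refits
cv_relax <- cv.glmnet(X, Y, family = "gaussian", relax = TRUE, gamma = 0)
plot(cv_relax)
```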

[Figure: Plot_Err output -- estimated errors (err, errcp, errcv, errcvup) for all previous models plus Lasso_1-Lasso_93 and LassoSup_2-LassoSup_93 (Lasso_1 has an empty support, hence no LassoSup_1)]

Plot_LogLik(errs_lasso)

[Figure: Plot_LogLik output -- log-likelihood criteria (LogLik, LogLikAIC, LogLikBIC) for the same models]

Find_Best(errlambdasup)

err : LassoSup_54 ( 321435.329904942 )
errcp : LassoSup_7 ( 418143.703788796 )
errcv : LassoSup_7 ( 429198.915997658 )
errcvup : LassoSup_7 ( 449102.657381476 )
LogLik : LassoSup_54 ( 2932.98302764237 )
LogLikAIC : LassoSup_7 ( 2985.1897247821 )
LogLikBIC : LassoSup_5 ( 3011.02447818654 )

Find_Best(errs_lasso)

err : LassoSup_54 ( 321435.329904942 )
errcp : BestGen_1 ( 413992.797239044 )
errcv : BestGen_2 ( 413612.04629105 )
errcvup : BestGen_2 ( 441873.398664721 )
LogLik : LassoSup_54 ( 2932.98302764237 )
LogLikAIC : BestGen_1 ( 2984.69948011354 )
LogLikBIC : LassoSup_5 ( 3011.02447818654 )

17. Find a better model...