Model Selection Erwan Le Pennec Fall 2015
```r
library("dplyr")
library("ggplot2")
library("ggfortify")
library("reshape2")
```

We will now use another classical dataset, birthwt, which corresponds to a study on risk factors associated with low infant birth weight conducted at Baystate Medical Center, Springfield, Mass. during 1986. It consists of 189 observations of 10 variables.

| Variable | Content |
|----------|---------|
| low      | indicator of birth weight less than 2.5 kg |
| age      | mother's age in years |
| lwt      | mother's weight in pounds at last menstrual period |
| race     | mother's race (1 = white, 2 = black, 3 = other) |
| smoke    | smoking status during pregnancy |
| ptl      | number of previous premature labors |
| ht       | history of hypertension |
| ui       | presence of uterine irritability |
| ftv      | number of physician visits during the first trimester |
| bwt      | birth weight in grams |

Our goal will be to predict bwt, the birth weight, from all the other variables (except low!).

1. Load the dataset from the package MASS and inspect it with glimpse.

```r
lbw <- MASS::birthwt
glimpse(lbw)
```

```
Observations: 189
Variables: 10
$ low   (int) 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
$ age   (int) 19, 33, 20, 21, 18, 21, 22, 17, 29, 26, 19, 19, 22, 30, ...
$ lwt   (int) 182, 155, 105, 108, 107, 124, 118, 103, 123, 113, 95, ...
$ race  (int) 2, 3, 1, 1, 1, 3, 1, 3, 1, 1, 3, 3, 3, 3, 1, 1, 2, 1, 3, ...
$ smoke (int) 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, ...
$ ptl   (int) 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, ...
$ ht    (int) 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, ...
$ ui    (int) 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, ...
$ ftv   (int) 0, 3, 1, 2, 0, 0, 1, 1, 1, 0, 0, 1, 0, 2, 0, 0, 0, 3, 0, ...
$ bwt   (int) 2523, 2551, 2557, 2594, 2600, 2622, 2637, 2637, 2663, ...
```
2. Fix the different factor issues.

```r
lbw <- mutate(lbw, low = factor(low, levels = c(0, 1), labels = c("normal", "low")))
lbw <- mutate(lbw, race = factor(race, levels = c(1, 2, 3), labels = c("white", "black", "other")))
lbw <- mutate(lbw, smoke = factor(smoke, levels = c(0, 1), labels = c("no", "yes")))
lbw <- mutate(lbw, ht = factor(ht, levels = c(0, 1), labels = c("no", "yes")))
lbw <- mutate(lbw, ui = factor(ui, levels = c(0, 1), labels = c("no", "yes")))
lbw <- select(lbw, -low)
glimpse(lbw)
```

```
Observations: 189
Variables: 9
$ age   (int)  19, 33, 20, 21, 18, 21, 22, 17, 29, 26, 19, 19, 22, 30, ...
$ lwt   (int)  182, 155, 105, 108, 107, 124, 118, 103, 123, 113, 95, ...
$ race  (fctr) black, other, white, white, white, other, white, other, ...
$ smoke (fctr) no, no, yes, yes, yes, no, no, no, yes, yes, no, no, no, ...
$ ptl   (int)  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, ...
$ ht    (fctr) no, no, no, no, no, no, no, no, no, no, no, no, yes, no, ...
$ ui    (fctr) yes, no, no, yes, yes, no, no, no, no, no, no, no, no, ...
$ ftv   (int)  0, 3, 1, 2, 0, 0, 1, 1, 1, 0, 0, 1, 0, 2, 0, 0, 0, 3, 0, ...
$ bwt   (int)  2523, 2551, 2557, 2594, 2600, 2622, 2637, 2637, 2663, ...
```

3. Verify that the dataset does not contain any missing values.

```r
summary(lbw)
```

```
      age             lwt           race     smoke          ptl
 Min.   :14.00   Min.   : 80.0   white:96   no :115   Min.   :...
 1st Qu.:...     1st Qu.:110.0   black:26   yes: 74   ...
 Median :23.00   Median :121.0   other:67
 Mean   :23.24   Mean   :129.8
 3rd Qu.:...     3rd Qu.:...
 Max.   :45.00   Max.   :250.0

   ht        ui          ftv            bwt
 no :177   no :161   Min.   :...   Min.   : 709
 yes: 12   yes: 28   ...           1st Qu.:2414
                                   Median :2977
                                   Mean   :2945
                                   3rd Qu.:3487
                                   Max.   :...
```

The summary reports no NA counts for any variable, so the dataset is complete.

4. Inspect visually all the variables independently.

```r
for (name in names(lbw)) {
  print(qplot(data = lbw, get(name), xlab = name))
}
```
(Figures: marginal distribution of each variable: histograms for age, lwt, ptl, ftv and bwt, bar charts for race, smoke, ht and ui.)
5. Inspect visually the relation between every variable and bwt. Can you infer the most useful variables?

```r
for (name in names(lbw)[-9]) {
  if (class(lbw[[name]]) == "factor") {
    print(ggplot(data = lbw, aes_string(x = name, y = "bwt")) +
            geom_boxplot() +
            geom_point(position = position_jitter(width = .1)))
  } else {
    print(ggplot(data = lbw, aes_string(x = name, y = "bwt")) +
            geom_point(position = position_jitter(width = .1)) +
            geom_smooth())
  }
}
```
(Figures: bwt against each predictor: boxplots with jittered points for the factors race, smoke, ht and ui; scatter plots with a smoother for age, lwt, ptl and ftv.)

6. Compute the full regression with all the variables and compute its summary (and maybe its diagnostic plots).
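Before reading the lm output, it may help to recall what the full regression actually computes. The Python sketch below (a toy illustration with made-up numbers, not the course's R code) recovers the least-squares coefficients, the residual standard error and the multiple R-squared from first principles:

```python
import numpy as np

# Toy design matrix standing in for a few birthwt rows (hypothetical values):
# columns are intercept, age, lwt.
X = np.array([[1.0, 19, 182],
              [1.0, 33, 155],
              [1.0, 20, 105],
              [1.0, 21, 108],
              [1.0, 18, 107]])
y = np.array([2523., 2551., 2557., 2594., 2600.])

# Least-squares fit: beta = argmin ||y - X beta||^2.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
rss = resid @ resid                      # residual sum of squares
tss = ((y - y.mean()) ** 2).sum()        # total sum of squares

n, p = X.shape
rse = np.sqrt(rss / (n - p))             # residual standard error, df = n - p
r2 = 1 - rss / tss                       # multiple R-squared

print(beta, rse, r2)
```

This is exactly the quantities `summary(lm(...))` reports as "Residual standard error" and "Multiple R-squared", up to the much larger design matrix of the real model.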
```r
reglbw <- lm(bwt ~ ., data = lbw)
summary(reglbw)
```

```
Call:
lm(formula = bwt ~ ., data = lbw)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)      ...        ...     ...  < 2e-16 ***
age              ...        ...     ...      ...
lwt              ...        ...     ...      ...  *
raceblack        ...        ...     ...      ...  **
raceother        ...        ...     ...      ...  **
smokeyes         ...        ...     ...      ...  **
ptl              ...        ...     ...      ...
htyes            ...        ...     ...      ...  **
uiyes            ...        ...     ...      ...  ***
ftv              ...        ...     ...      ...
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: ... on 179 degrees of freedom
Multiple R-squared: ...,  Adjusted R-squared: ...
F-statistic: ... on 9 and 179 DF,  p-value: 7.891e-08
```

```r
autoplot(reglbw)
```
(Figure: diagnostic plots of the full model: Residuals vs Fitted, Normal Q-Q, Scale-Location and Residuals vs Leverage.)

7. Compute the trivial regression with no variables but the intercept, as a reference of a _bad_ method.

```r
reglbwtriv <- lm(bwt ~ 1, data = lbw)
summary(reglbwtriv)
```

```
Call:
lm(formula = bwt ~ 1, data = lbw)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)      ...        ...     ...   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: ... on 188 degrees of freedom
```

8. Create a function that, given a lm model, computes the empirical error, the debiased error, the cross-validation error, the deviance (-2 log-likelihood), the AIC criterion and the BIC criterion.

```r
V <- 5
LbwFolds <- caret::createMultiFolds(lbw[["bwt"]], k = V, times = T)
```
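The criteria this function returns can be written down explicitly. The sketch below (plain Python on made-up per-fold numbers, not the R implementation with lm and the caret folds) spells out the formulas: the empirical mean squared error, the Cp-style debiased error err * (1 + 2k/n), the cross-validation error with its upper bound mean + 2 sd/sqrt(#folds), and the Gaussian deviance-based AIC and BIC penalties:

```python
import math

# Hypothetical per-fold CV squared errors and training error (made-up values).
fold_errors = [410000.0, 452000.0, 398000.0, 431000.0, 420000.0]
n, k = 189, 10            # sample size, number of regression coefficients
err = 412000.0            # empirical MSE on the training data

# Cp-style debiased error: inflate the optimistic training error by 2k/n.
errcp = err * (1 + 2 * k / n)

# Cross-validation error and its upper bound (mean + 2 standard errors).
errcv = sum(fold_errors) / len(fold_errors)
sd = math.sqrt(sum((e - errcv) ** 2 for e in fold_errors) / (len(fold_errors) - 1))
errcvup = errcv + 2 * sd / math.sqrt(len(fold_errors))

# For a Gaussian model, -2 log-likelihood = n log(RSS/n) up to constants, and
# AIC = deviance + 2 * (#parameters), BIC = deviance + log(n) * (#parameters).
rss = n * err
deviance = n * math.log(rss / n)
aic = deviance + 2 * (k + 1)              # +1 for the noise variance
bic = deviance + math.log(n) * (k + 1)

print(errcp, errcv, errcvup, aic, bic)
```

Since log(189) > 2, BIC always penalizes model size more heavily than AIC here, which is why it tends to select smaller models below.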
```r
computeerrlm <- function(model, name) {
  err <- mean((lbw[["bwt"]] - predict(model))^2)
  errcp <- err * (1 + 2 * length(model[["coefficients"]]) / nrow(lbw))
  errcvtmp <- matrix(0, nrow = 1, ncol = (T * V))
  for (v in 1:(T * V)) {
    lbwtrain <- slice(lbw, LbwFolds[[v]])
    lbwtest <- slice(lbw, -LbwFolds[[v]])
    regtmp <- update(model, data = lbwtrain)
    predtmp <- predict(regtmp, newdata = lbwtest)
    errcvtmp[v] <- mean((lbwtest[["bwt"]] - predtmp)^2)
  }
  errcv <- mean(errcvtmp)
  errcvup <- errcv + 2 * sd(errcvtmp) / sqrt(T * V)
  LogLik <- -2 * logLik(model)
  LogLikAIC <- AIC(model)
  LogLikBIC <- BIC(model)
  data.frame(method = name, err = err, errcp = errcp,
             errcv = errcv, errcvup = errcvup,
             LogLik = LogLik, LogLikAIC = LogLikAIC, LogLikBIC = LogLikBIC)
}
```

9. Compute the errors of the trivial and the full model.

```r
errs <- computeerrlm(reglbwtriv, "Trivial")
errs <- rbind(errs, computeerrlm(reglbw, "Full"))
errs
```

```
   method err errcp errcv errcvup LogLik LogLikAIC LogLikBIC
1 Trivial ...   ...   ...     ...    ...       ...       ...
2    Full ...   ...   ...     ...    ...       ...       ...
```

Create a function that takes a data frame of errors for possibly several models and plots them. Test it on the full model.

```r
Plot_Err <- function(errs) {
  ggplot(data = melt(select(errs, -matches("loglik"))),
         aes(x = method, y = value, color = variable)) +
    geom_point(size = 5) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
}
Plot_Err(errs)
```
(Figure: err, errcp, errcv and errcvup for the Trivial and Full models.)

```r
Plot_LogLik <- function(errs) {
  ggplot(data = melt(select(errs, -matches("err"))),
         aes(x = method, y = value, color = variable)) +
    geom_point(size = 5) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
}
Plot_LogLik(errs)
```
(Figure: LogLik, LogLikAIC and LogLikBIC for the Trivial and Full models.)

10. According to the summary, which variables can be removed from the model? Test this assumption by removing them, computing the errors and plotting them for the two models.

```r
reglbw2 <- update(reglbw, ~ . - age - ptl - ftv)
summary(reglbw2)
```

```
Call:
lm(formula = bwt ~ lwt + race + smoke + ht + ui, data = lbw)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)      ...        ...     ...  < 2e-16 ***
lwt              ...        ...     ...      ...  *
raceblack        ...        ...     ...      ...  **
raceother        ...        ...     ...      ...  **
smokeyes         ...        ...     ...      ...  ***
htyes            ...        ...     ...      ...  **
uiyes            ...        ...     ...      ...  ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: ... on 182 degrees of freedom
Multiple R-squared: ...,  Adjusted R-squared: ...
F-statistic: 9.6 on 6 and 182 DF,  p-value: 3.601e-09
```

```r
errs <- rbind(errs, computeerrlm(reglbw2, "Simplified"))
Plot_Err(errs)
```

(Figure: error criteria for the Trivial, Full and Simplified models.)

```r
Plot_LogLik(errs)
```
(Figure: LogLik, LogLikAIC and LogLikBIC for the Trivial, Full and Simplified models.)

```r
Find_Best <- function(errs) {
  nameserr <- names(errs)[-1]
  for (nameerr in nameserr) {
    writeLines(strwrap(paste(nameerr, ": ",
                             errs[["method"]][which.min(errs[[nameerr]])],
                             "(", min(errs[[nameerr]], na.rm = TRUE), ")")))
  }
}
Find_Best(errs)
```

```
err : Full (...)
errcp : Simplified (...)
errcv : Simplified (...)
errcvup : Simplified (...)
LogLik : Full (...)
LogLikAIC : Simplified (...)
LogLikBIC : Simplified (...)
```

11. What would be the next simplification? Is it efficient?

```r
reglbw3 <- update(reglbw2, ~ . - lwt)
summary(reglbw3)
```

```
Call:
lm(formula = bwt ~ race + smoke + ht + ui, data = lbw)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)      ...        ...     ...  < 2e-16 ***
raceblack        ...        ...     ...      ...  **
raceother        ...        ...     ...      ...  ***
smokeyes         ...        ...     ...      ...  ***
htyes            ...        ...     ...      ...  *
uiyes            ...        ...     ...   ...e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: ... on 183 degrees of freedom
Multiple R-squared: ...,  Adjusted R-squared: ...
F-statistic: ... on 5 and 183 DF,  p-value: 1.98e-08
```

```r
errs <- rbind(errs, computeerrlm(reglbw3, "Simplified2"))
Plot_Err(errs)
```

(Figure: error criteria for the Trivial, Full, Simplified and Simplified2 models.)

```r
Plot_LogLik(errs)
```
(Figure: LogLik, LogLikAIC and LogLikBIC for the four models.)

```r
Find_Best(errs)
```

```
err : Full (...)
errcp : Simplified (...)
errcv : Simplified (...)
errcvup : Simplified (...)
LogLik : Full (...)
LogLikAIC : Simplified (...)
LogLikBIC : Simplified (...)
```

12. Use glmulti from the package of the same name to test all the possible variable subsets without any interaction. (Use the level = 1 option!) What is the best model according to the AIC criterion?

```r
library(glmulti)
bests <- glmulti(bwt ~ ., data = lbw, level = 1, family = "gaussian",
                 plotty = FALSE)  # You may use plotty = TRUE to follow the search
```

```
Initialization...
TASK: Exhaustive screening of candidate set.
Fitting...
After 50 models:
Best model: bwt~1+race+smoke
Crit= ...  Mean crit= ...
After 100 models:
Best model: bwt~1+race+smoke+lwt
Crit= ...  Mean crit= ...
After 150 models:
Best model: bwt~1+race+smoke+ht+lwt
Crit= ...  Mean crit= ...
After 200 models:
Best model: bwt~1+race+smoke+ui+lwt
Crit= ...  Mean crit= ...
After 250 models:
Best model: bwt~1+race+smoke+ht+ui
Crit= ...  Mean crit= ...
Completed.
```

13. The @formulas slot of the result contains a list of the best models. Use this to compute all the errors for the 50 best models. Compare those errors with those of our naive attempts.

```r
errmulti <- data.frame()
for (f in 1:50) {
  model <- lm(bests@formulas[[f]], data = lbw)
  errmulti <- rbind(errmulti, computeerrlm(model, sprintf("Best_%g", f)))
}
errs_multi <- rbind(errs, errmulti)
Plot_Err(errs_multi)
```
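A quick sanity check on the size of this exhaustive search: with 8 candidate predictors and no interactions, every model corresponds to a subset of the predictors. The back-of-the-envelope Python count below ignores glmulti's internal bookkeeping (factor handling, marginality rules), but it is consistent with a screening that finishes shortly after 250 fitted models, and it already hints that adding pairwise interactions makes exhaustion hopeless:

```python
# Each of the 8 predictors (age, lwt, race, smoke, ptl, ht, ui, ftv) is
# either in or out of the model, so the level-1 candidate set is a power set.
n_predictors = 8
n_models = 2 ** n_predictors
print(n_models)  # 256

# With all 28 pairwise interactions also allowed (level = 2), an upper bound
# ignoring marginality constraints is 2^(8 + 28) candidate term subsets.
n_terms = n_predictors + n_predictors * (n_predictors - 1) // 2  # 8 + 28 = 36
upper_bound = 2 ** n_terms
print(upper_bound)  # 68719476736
```

glmulti's own diagnostic (method = "d", used in the next task) reports the exact candidate count under its constraints; the point here is only the order of magnitude.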
(Figure: error criteria for the naive models and the 50 glmulti models.)

```r
Plot_LogLik(errs_multi)
```

(Figure: LogLik, LogLikAIC and LogLikBIC for the same models.)
```r
Find_Best(errmulti)
```

```
err : Best_9 (...)
errcp : Best_1 (...)
errcv : Best_1 (...)
errcvup : Best_4 (...)
LogLik : Best_9 (...)
LogLikAIC : Best_1 (...)
LogLikBIC : Best_1 (...)
```

```r
Find_Best(errs_multi)
```

```
err : Full (...)
errcp : Simplified (...)
errcv : Simplified (...)
errcvup : Best_4 (...)
LogLik : Full (...)
LogLikAIC : Simplified (...)
LogLikBIC : Simplified (...)
```

14. Add the interactions of level 2 and use glmulti with method = "d" to find the number of models. Is the exhaustive search possible?

```r
glmulti(bwt ~ ., data = lbw, level = 2, family = "gaussian",
        method = "d", plotty = FALSE)
```

```
Initialization...
TASK: Diagnostic of candidate set.
Sample size: 189
4 factor(s).
4 covariate(s).
0 f exclusion(s).
0 c exclusion(s).
0 f:f exclusion(s).
0 c:c exclusion(s).
0 f:c exclusion(s).
Size constraints: min = 0 max = -1
Complexity constraints: min = 0 max = -1
Your candidate set contains ... models.
```

15. Use the genetic algorithm of glmulti (method = "g") to explore those models and examine the best 25 solutions.

```r
bestsgen <- glmulti(bwt ~ ., data = lbw, level = 2, family = "gaussian",
                    method = "g", plotty = FALSE)
```

```
Initialization...
TASK: Genetic algorithm in the candidate set.
Initialization...
Algorithm started...
After 10 generations:
Best model: bwt~1+race+smoke+ht+ui+age+lwt+ptl+lwt:age+ptl:age+ftv:ptl+smoke:age+smoke:ptl+ht:age+ht:...
Crit= ...  Mean crit= ...
After 20 generations:
Best model: bwt~1+race+smoke+ui+age+lwt+ptl+ftv+lwt:age+ptl:age+ptl:lwt+ftv:age+smoke:age+smoke:ptl+h...
Crit= ...  Mean crit= ...
After 30 generations:
Best model: bwt~1+race+smoke+age+lwt+ptl+ftv+ptl:age+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+u...
Crit= ...  Mean crit= ...
[no change in the best model from generation 40 to 100]
After 110 generations:
Best model: bwt~1+race+smoke+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl
Crit= ...  Mean crit= ...
After 120 generations:
Best model: bwt~1+race+age+lwt+ptl+ftv+ptl:lwt+ftv:age+smoke:age+ht:age+ht:lwt+ui:lwt+ui:ptl
Crit= ...  Mean crit= ...
[no further change in the best model up to generation 570]
After 570 generations:
Improvements in best and average IC have been below the specified goals.
Algorithm is declared to have converged.
Completed.
```

```r
errgen <- data.frame()
for (f in 1:25) {
  model <- lm(bestsgen@formulas[[f]], data = lbw)
  errgen <- rbind(errgen, computeerrlm(model, sprintf("BestGen_%g", f)))
}
errs_gen <- rbind(errs_multi, errgen)
Plot_Err(errs_gen)
```
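The idea behind glmulti's method = "g" search can be illustrated in miniature. The Python sketch below (a toy genetic algorithm over variable subsets encoded as bitmasks, minimizing a Gaussian AIC on synthetic data; it is not glmulti's algorithm, which also handles factors and marginality constraints) evolves a population by tournament selection and single-bit mutations:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 6
X = rng.normal(size=(n, p))
beta = np.array([2.0, -1.5, 0.0, 0.0, 1.0, 0.0])   # true model uses x0, x1, x4
y = X @ beta + rng.normal(size=n)

def aic(mask):
    """Gaussian AIC of the OLS fit using the predictors selected by the bitmask."""
    cols = [j for j in range(p) if (mask >> j) & 1]
    M = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(M, y, rcond=None)
    rss = float(((y - M @ coef) ** 2).sum())
    k = M.shape[1] + 1                               # coefficients + noise variance
    return n * np.log(rss / n) + 2 * k

best_exhaustive = min(range(2 ** p), key=aic)        # feasible here: only 64 subsets

# Toy genetic algorithm: tournament selection plus single-bit mutation.
pop = [int(m) for m in rng.integers(0, 2 ** p, size=20)]
for _ in range(30):
    new_pop = []
    for _ in range(len(pop)):
        i, j = rng.integers(0, len(pop), size=2)
        parent = pop[i] if aic(pop[i]) < aic(pop[j]) else pop[j]
        child = parent ^ (1 << int(rng.integers(0, p)))  # flip one random bit
        new_pop.append(min(parent, child, key=aic))      # keep the better of the two
    pop = new_pop
best_ga = min(pop, key=aic)
```

The genetic search never beats the exhaustive optimum, but it only evaluates a small fraction of the candidate set, which is exactly why it remains usable when the level-2 candidate set is far too large to enumerate.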
(Figure: error criteria for all models so far, including the 25 genetic-algorithm solutions.)

```r
Plot_LogLik(errs_gen)
```

(Figure: LogLik, LogLikAIC and LogLikBIC for the same models.)
```r
Find_Best(errgen)
```

```
err : BestGen_19 (...)
errcp : BestGen_1 (...)
errcv : BestGen_2 (...)
errcvup : BestGen_2 (...)
LogLik : BestGen_19 (...)
LogLikAIC : BestGen_1 (...)
LogLikBIC : BestGen_2 (...)
```

```r
Find_Best(errs_gen)
```

```
err : BestGen_19 (...)
errcp : BestGen_1 (...)
errcv : BestGen_2 (...)
errcvup : BestGen_2 (...)
LogLik : BestGen_19 (...)
LogLikAIC : BestGen_1 (...)
LogLikBIC : Simplified (...)
```

16. Use glmnet to try a regularization method to obtain a best model.

```r
X <- model.matrix(bwt ~ .^2 - 1, data = lbw)
Y <- lbw[["bwt"]]
library("glmnet")
lbw_lasso <- glmnet(X, Y, family = "gaussian")
coeffs_lbw_lasso <- cbind(data.frame(t(as.matrix(coef(lbw_lasso)))),
                          lambda = lbw_lasso[["lambda"]])
ggplot(data = melt(coeffs_lbw_lasso, "lambda"),
       aes(x = lambda, y = value, color = variable)) +
  geom_line()
```
(Figure: lasso regularization path: each coefficient, including the interaction terms, as a function of lambda.)

```r
computeerrglmnet <- function(model, lambda, name) {
  err <- mean((Y - predict(model, X, s = lambda))^2)
  errcp <- err * (1 + 2 * sum(abs(coef(model, s = lambda)) > 0) / nrow(lbw))
  errcvtmp <- matrix(0, nrow = 1, ncol = (T * V))
  for (v in 1:(T * V)) {
    Xtrain <- X[LbwFolds[[v]], ]
    Xtest <- X[-LbwFolds[[v]], ]
    Ytrain <- Y[LbwFolds[[v]]]
    Ytest <- Y[-LbwFolds[[v]]]
    regtmp <- glmnet(Xtrain, Ytrain, family = "gaussian", lambda = lambda)
    predtmp <- predict(regtmp, Xtest, s = lambda)
    errcvtmp[v] <- mean((Ytest - predtmp)^2)
  }
  errcv <- mean(errcvtmp)
  errcvup <- errcv + 2 * sd(errcvtmp) / sqrt(T * V)
  data.frame(method = name, err = err, errcp = errcp,
             errcv = errcv, errcvup = errcvup,
             LogLik = NA, LogLikAIC = NA, LogLikBIC = NA)
}

computeerrlm2 <- function(model, name) {
  err <- mean((lbwint[["bwt"]] - predict(model))^2)
  errcp <- err * (1 + 2 * length(model[["coefficients"]]) / nrow(lbw))
  errcvtmp <- matrix(0, nrow = 1, ncol = (T * V))
  for (v in 1:(T * V)) {
    lbwtrain <- slice(lbwint, LbwFolds[[v]])
    lbwtest <- slice(lbwint, -LbwFolds[[v]])
    regtmp <- update(model, data = lbwtrain)
    predtmp <- predict(regtmp, newdata = lbwtest)
    errcvtmp[v] <- mean((lbwtest[["bwt"]] - predtmp)^2)
  }
  errcv <- mean(errcvtmp)
  errcvup <- errcv + 2 * sd(errcvtmp) / sqrt(T * V)
  LogLik <- -2 * logLik(model)
  LogLikAIC <- AIC(model)
  LogLikBIC <- BIC(model)
  data.frame(method = name, err = err, errcp = errcp,
             errcv = errcv, errcvup = errcvup,
             LogLik = LogLik, LogLikAIC = LogLikAIC, LogLikBIC = LogLikBIC)
}

errlambda <- data.frame()
errlambdasup <- data.frame()
dx <- data.frame(X)
lbwint <- cbind(dx, bwt = Y)
for (l in 1:length(lbw_lasso[["lambda"]])) {
  lambda <- lbw_lasso[["lambda"]][l]
  errlambda <- rbind(errlambda,
                     computeerrglmnet(lbw_lasso, lambda, sprintf("Lasso_%g", l)))
  subsetlambda <- which(abs(coef(lbw_lasso, s = lambda)[-1]) > 0)
  if (length(subsetlambda) > 0) {
    reglambda <- lm(bwt ~ ., data = mutate(select(dx, subsetlambda), bwt = Y))
    errlambdasup <- rbind(errlambdasup,
                          computeerrlm2(reglambda, sprintf("LassoSup_%g", l)))
  }
}
errs_lasso <- rbind(errs_gen, errlambda, errlambdasup)
Plot_Err(errs_lasso)
```
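Under the hood, glmnet solves the lasso problem by cyclic coordinate descent, whose basic building block is the soft-thresholding operator. The Python sketch below (a minimal illustration on standardized toy data, not glmnet's actual implementation, which adds warm starts and active-set tricks) shows why the lasso sets coefficients exactly to zero:

```python
import numpy as np

def soft_threshold(z, gamma):
    """Soft-thresholding operator: shrink z toward 0 by gamma, clipping at 0."""
    return np.sign(z) * max(abs(z) - gamma, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) ||y - X b||^2 + lam * ||b||_1 by cyclic updates."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]       # partial residual without feature j
            z = X[:, j] @ r / n
            b[j] = soft_threshold(z, lam) / (X[:, j] @ X[:, j] / n)
    return b

rng = np.random.default_rng(42)
n = 200
X = rng.normal(size=(n, 3))
X = (X - X.mean(0)) / X.std(0)                   # standardize columns
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=n)

b = lasso_coordinate_descent(X, y, lam=0.5)
# With a large enough penalty, the irrelevant coefficients are driven to zero,
# which is what produces the sparse supports reused by the LassoSup models.
```

This exact-zero behavior is why the `which(abs(coef(...)) > 0)` trick above yields a genuine variable subset for each lambda, on which an unpenalized lm can then be refitted.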
[Plot: estimated errors (err, errcp, errcv, errcvup) for every candidate model — Trivial, Full, Simplified, Simplified2, Best_1 to Best_50, BestGen_1 to BestGen_25, Lasso_1 to Lasso_93 and LassoSup_2 to LassoSup_93.]

Plot_LogLik(errs_lasso)

[Plot: log-likelihood criteria (LogLik, LogLikAIC, LogLikBIC) for the same candidate models.]
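As a reminder of where the Lasso_* models on the plot come from, here is a minimal sketch of fitting a lasso path on this dataset with glmnet. The variable names (X, y, fit, cvfit) are illustrative; this is not the exact code used to build errs_lasso.

```r
library(glmnet)

# Build the design matrix (factors expanded to dummies), dropping the intercept column.
X <- model.matrix(bwt ~ ., data = lbw)[, -1]
y <- lbw$bwt

# Full lasso path: one model per value of lambda.
fit <- glmnet(X, y)

# 10-fold cross-validation to select lambda.
cvfit <- cv.glmnet(X, y)

# Coefficients at the largest lambda within one standard error of the CV minimum.
coef(cvfit, s = "lambda.1se")
```

Each lambda along the path yields one candidate model, which is how a single call produces the long Lasso_1, Lasso_2, ... sequence compared above.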
Find_Best(errlambdasup)

err : LassoSup_54 ( )
errcp : LassoSup_7 ( )
errcv : LassoSup_7 ( )
errcvup : LassoSup_7 ( )
LogLik : LassoSup_54 ( )
LogLikAIC : LassoSup_7 ( )
LogLikBIC : LassoSup_5 ( )

Find_Best(errs_lasso)

err : LassoSup_54 ( )
errcp : BestGen_1 ( )
errcv : BestGen_2 ( )
errcvup : BestGen_2 ( )
LogLik : LassoSup_54 ( )
LogLikAIC : BestGen_1 ( )
LogLikBIC : LassoSup_5 ( )

17. Find a better model...
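To hunt for a better model by hand, one simple starting point is to compare a few candidate formulas by AIC and BIC with base R. This is only a sketch — the formulas below are illustrative choices, not the models selected by Find_Best above.

```r
# Compare candidate models for bwt by AIC and BIC (smaller is better).
models <- list(
  Trivial    = lm(bwt ~ 1, data = lbw),
  Full       = lm(bwt ~ ., data = lbw),
  Simplified = lm(bwt ~ lwt + race + smoke + ht + ui, data = lbw)
)

criteria <- data.frame(
  AIC = sapply(models, AIC),
  BIC = sapply(models, BIC)
)

# Rank the candidates by BIC.
criteria[order(criteria$BIC), ]
```

AIC and BIC will not always agree: BIC penalizes model size more heavily, which is why the BIC winner above (LassoSup_5) is sparser than the AIC winner.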