Hi, I am looking for someone to help answer all the questions in this Project 2, which involves some R code. Please read the assignment requirements carefully; the due date is in 3 days. I will post additional R materials if needed. The project should be well organized and fully answered. Thanks
sta_138_winter_2019_project_ii.pdf

prostate.csv.xls

STA138 week 7 UC Davis Multiple Logistic Regression problems

angina.csv.xls

sta_138_week_07_model_selection_logistic_regression.rmd.txt

sta_138_week_06_multiple_logistic_regression_and_diagnostics.rmd.txt


STA 138 Exam II Project, due
Friday, March 8th in lecture
Read the following instructions carefully:
• You may work in a group of two, or by yourself.
• You are not allowed to discuss the questions with anyone other than the instructor or TA and
your group mate.
• Any outside help beyond that from the instructor or TA is considered plagiarism. This includes asking a tutor, asking your classmates (for example, comparing answers), posting the questions to homework help sites, etc. Should we believe you have sought outside help, you will be reported to the Student Judicial Affairs office.
• You are allowed to use or modify your previous functions, or the instructor's functions that are posted online.
• Do not share answers, or specific values for calculations, particularly on Piazza.
• You may ask clarifying questions about code and general approach on Piazza, but do not give away any
numerical answers. If you are concerned you may be giving something away, email me or the TA’s directly.
Multiple Logistic Regression
Choose one and only one of the following:
Problem I
We will be using the dataset online called prostate.csv. The rows contain information from patients who are
being assessed for prostate cancer. The variables included are:
• y: Indicator of prostate cancer diagnosis (1) or no cancer diagnosis (0)
• psa: Serum prostate-specific antigen level (mg/ml)
• c.vol : Estimate of prostate cancer volume (cc)
• weight: Prostate weight (gm)
• age: Age of patient (years)
• benign: Amount of benign prostatic hyperplasia (cm^2)
• inv: Presence ("invasion") or absence ("no-invasion") of seminal vesicle invasion.
• cap: Degree of capsular penetration (cm)
The goal of this problem is to build a model where Y = prostate cancer status.
Problem II
We will be using the dataset online called angina.csv. This study was conducted in order to assess the relationship between various predictor variables and whether the subject had angina (a condition where the heart muscles receive insufficient oxygen-rich blood). The columns are as follows:
Column 1: y: 1 if angina present, 0 if absent
Column 2: age: The age of the subject in years
Column 3: smoke: With values current, ex, or never (indicating smoking history).
Column 4: cig: The average number of cigarettes smoked per day
Column 5: hyper: History of hypertension in the family, with values absent, mild, moderate.
Column 6: myofam: History of myocardial infarction in the family, with values yes, no.
Column 7: strokefam: History of stroke in the family, with values yes, no.
Column 8: diabetes: History of diabetes in the family, with values yes, no.
The goal of this problem is to build a model where Y = angina status.
The Report Format
This should be a report. This means you write in full sentences, and have the following sections for each question,
while being as specific as you can about your results:
I: Summary. This should include summary plots describing the relationship between your explanatory and response variables, and any numerical summaries you find interesting.
II: Data Preparation. Consider outliers and influential points. Note: You may fit a model first, then consider
outliers and influential points.
III: Model Selection/Analysis. Perform model selection. You may choose either correctness or prediction as your goal, but if you choose prediction you must use a model selection technique (you cannot default to the largest model).
This section should include the results of your “best” model fit, including which model selection criteria you
used, what your final model was, any confidence intervals or hypothesis tests you will interpret in a later
section, and the estimated logistic regression function.
For this section, it is your choice whether or not to include interaction terms. Or, you may first determine which single terms are important, then see whether any interactions involving those terms are also important.
IV: Interpretation: Interpret the coefficients and any confidence intervals or p-values that you calculated.
V: Prediction: Predict π̂, and report back measures of prediction. Consider model goodness-of-fit measures,
error matrices, etc.
• If you choose Problem 1: Based on your "best" model (i.e., you may not use all of these values), predict the probability of prostate cancer diagnosis for someone with 10 psa, 5 c.vol, 40g for weight, age 67, with 2.5 benign, with no seminal vesicle invasion, and with 0.5 cm cap.
• If you choose Problem 2: Based on your "best" model (i.e., you may not use all of these values), predict the probability of angina for a 50 year old who has never smoked, with history of hypertension, angina, and stroke (and no other history of medical issues).
VI: Conclusion: One or two sentences on what variables you found were most important to your model, and how
they affected your outcome.
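The interpretation and prediction steps in parts IV-V can be sketched in R along the following lines. This is only an illustration using the built-in `mtcars` data as a stand-in for the project data; the model, the hypothetical new observation, and the 0.5 cutoff are all placeholder assumptions you would replace with your own choices.

```{r}
# Sketch of parts IV-V, illustrated with built-in mtcars data.
# Fit a small logistic model (am = automatic/manual as a binary response).
the.model = glm(am ~ mpg + wt, data = mtcars, family = binomial(link = logit))

# Interpretation: exponentiated coefficients are estimated odds ratios.
exp(coef(the.model))

# Prediction: pi-hat for a hypothetical new observation (values made up).
new.obs = data.frame(mpg = 21, wt = 2.8)
pi.hat = predict(the.model, newdata = new.obs, type = "response")
pi.hat

# Error matrix at a 0.5 cutoff (the cutoff choice is yours).
y.hat = ifelse(fitted(the.model) > 0.5, 1, 0)
table(observed = mtcars$am, predicted = y.hat)
```

The same pattern (`predict(..., type = "response")` for probabilities, a cutoff plus `table()` for the error matrix) applies directly to whichever "best" model you select.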
Details
Your report should be the following format:
i. Typed.
ii. A title page including your name/s, the name of the class, and the name of your instructor (me).
iii. Double-sided pages.
iv. An appendix of your R code used to produce the results. Do not include R code in the body of your report.
For example, your project should be put together in the following order (stapled):
Cover Page
Parts I-VI
Code appendix
Feel free to make your cover page "unique" so that it is easy to find when I hand them back.
Notice: your project will be graded as a group effort (if you have two people). This means that you are responsible for your own work, and your partner's work. I will not assign two different grades to one project.
y,psa,c.vol,weight,age,benign,inv,cap
0,0.651,0.5599,15.959,50,0,no-invasion,0
0,0.852,0.3716,27.66,58,0,no-invasion,0
0,0.852,0.6005,14.732,74,0,no-invasion,0
0,0.852,0.3012,26.576,58,0,no-invasion,0
0,1.448,2.117,30.877,62,0,no-invasion,0
0,2.16,0.3499,25.28,50,0,no-invasion,0
0,2.16,2.0959,32.137,64,1.8589,no-invasion,0
0,2.34,1.9937,34.467,58,4.6646,no-invasion,0
0,2.858,0.4584,34.467,47,0,no-invasion,0
0,2.858,1.2461,25.534,63,0,no-invasion,0
0,3.561,1.284,36.598,65,0,no-invasion,0
0,3.561,0.2592,36.598,63,3.5609,no-invasion,0
0,3.561,5.0028,20.491,63,0,no-invasion,0.5488
0,3.857,4.3929,20.086,67,0,no-invasion,0
0,4.055,3.3535,31.187,57,0,no-invasion,0.6505
0,4.263,4.6646,21.328,66,0,no-invasion,0
0,4.349,0.657,33.784,70,3.4556,no-invasion,0.5488
0,4.437,9.8749,38.475,66,0,no-invasion,1.4477
0,4.759,0.5712,26.311,41,0,no-invasion,0
0,4.953,1.1972,46.063,70,5.2593,no-invasion,0
0,5.155,3.1582,30.569,59,0,no-invasion,0
0,5.259,7.846,33.115,60,4.3492,no-invasion,3.8574
0,5.474,0.5827,29.371,59,0.4493,no-invasion,0
0,5.529,5.9299,31.5,63,1.5527,no-invasion,3.2544
0,5.641,1.477,39.252,69,4.953,no-invasion,0
0,5.871,4.2631,22.646,68,1.3499,no-invasion,0
0,6.05,1.6653,41.264,65,0,no-invasion,0.4493
0,6.172,0.6703,47.942,67,6.1719,no-invasion,0
0,6.36,2.8292,22.874,67,1.2461,no-invasion,1.0513
0,6.619,11.134,29.371,65,0,no-invasion,5.0531
0,6.821,1.3364,59.74,65,7.0993,no-invasion,0.4493
0,7.463,1.1972,450.339,65,5.4739,no-invasion,0
0,7.463,3.5966,20.905,71,3.5609,no-invasion,0
0,7.538,1.0101,26.311,54,0,no-invasion,0
0,7.768,0.99,25.028,63,0,no-invasion,0.4493
0,8.085,3.7062,61.559,64,8.7583,no-invasion,0
1,8.671,4.1371,38.861,73,0.5599,no-invasion,5.2593
0,8.935,1.5841,10.697,64,0,no-invasion,0
0,9.116,14.2963,59.74,68,3.9354,invasion,6.2339
0,9.777,2.2255,20.287,56,2.56,no-invasion,0.8521
1,9.974,1.8589,23.104,60,0,no-invasion,0
0,10.074,4.2207,39.646,68,0,no-invasion,0
0,10.278,1.786,47.942,62,5.529,no-invasion,0.6505
0,10.697,5.8709,49.402,61,0,no-invasion,2.2479
0,12.429,4.4371,30.265,66,5.7546,no-invasion,0.6505
0,12.807,5.2593,29.666,61,1.8589,no-invasion,0
1,13.066,15.3329,54.598,79,6.5535,invasion,14.2963
0,13.066,3.1899,56.826,68,5.529,no-invasion,0.6505
0,13.33,5.7546,33.115,43,0,no-invasion,0
0,13.33,3.3872,35.517,70,3.9354,no-invasion,0.4493
0,14.296,2.9743,54.055,68,0,no-invasion,0
0,14.585,5.2593,68.717,64,7.9248,no-invasion,0
0,14.585,1.6653,37.713,64,4.4371,no-invasion,1.0513
0,14.732,8.4149,61.559,68,5.8709,no-invasion,4.2631
1,14.88,23.3361,33.784,59,0,no-invasion,0
0,15.18,3.5609,72.24,66,8.3311,no-invasion,0
0,16.281,2.6379,17.637,47,0,no-invasion,1.6487
0,16.281,1.5841,42.948,49,4.1371,no-invasion,0
1,16.61,1.716,65.366,70,1.5527,no-invasion,0
0,16.61,2.8864,46.993,61,3.6328,no-invasion,0
0,17.116,1.5841,91.836,73,10.2779,no-invasion,0
0,17.288,7.3891,41.264,63,5.0531,invasion,6.7531
1,17.288,16.119,33.784,72,0,no-invasion,4.7588
0,17.814,7.6141,50.4,66,7.4633,invasion,8.2482
0,17.814,7.9248,37.338,64,0,no-invasion,0
0,17.993,4.306,46.525,61,3.7434,no-invasion,0.6505
0,18.541,7.5383,48.424,68,5.9299,no-invasion,3.7434
0,19.298,9.025,57.397,72,10.0744,no-invasion,0.6505
0,19.298,0.6376,82.269,69,0,no-invasion,0
0,19.492,3.2871,119.104,72,10.2779,no-invasion,0.4493
0,20.287,6.4237,36.234,60,0,invasion,3.7434
0,20.905,3.1899,28.219,77,5.7546,no-invasion,0
0,21.328,3.3535,46.063,69,0,invasion,1.2461
1,21.758,6.2965,25.534,60,1.5527,invasion,3.2544
1,26.576,20.0855,46.993,69,0,invasion,6.7531
0,28.219,23.1039,26.05,68,0.9512,invasion,11.2459
1,29.666,7.4633,83.931,72,8.3311,no-invasion,1.6487
1,31.187,12.6797,77.478,78,10.2779,no-invasion,0
0,31.817,14.154,35.874,69,0,invasion,13.1971
1,33.448,16.119,45.604,63,0,no-invasion,1.4477
0,33.784,4.3492,21.542,66,1.7507,no-invasion,1.2461
0,34.124,12.3049,32.137,57,1.5527,no-invasion,10.2779
0,35.517,13.5991,48.911,77,0.5886,invasion,1.7507
1,35.517,14.5851,46.525,65,3.0649,no-invasion,5.7546
1,36.234,4.7588,40.854,60,5.4739,no-invasion,2.2479
1,37.713,27.1126,33.784,64,0,invasion,10.2779
0,39.646,7.5383,41.679,58,5.1552,no-invasion,0
0,40.854,5.6407,29.079,62,0,invasion,1.3499
1,53.517,16.6099,112.168,65,0,invasion,11.7048
1,54.055,4.7588,40.447,76,2.56,invasion,2.2479
0,56.261,25.7903,60.34,68,0,no-invasion,0
0,62.178,12.5535,39.646,61,3.8574,invasion,0
1,80.64,16.9455,48.424,68,0,invasion,3.7434
1,107.77,45.6042,49.402,44,0,invasion,8.7583
1,170.716,18.3568,29.964,52,0,invasion,11.7048
1,239.847,17.8143,43.38,68,4.7588,invasion,4.7588
1,265.072,32.1367,52.985,68,1.5527,invasion,18.1741
y,age,smoke,cig,hyper,myofam,strokefam,diabetes
0,37,current,15,absent,no,no,no
0,45,never,0,absent,no,no,no
1,60,never,0,absent,yes,no,no
0,57,never,0,absent,no,no,no
1,65,current,20,mild,yes,no,no
1,56,current,20,absent,no,no,no
0,42,never,0,absent,no,no,no
1,52,current,30,absent,no,no,no
1,61,ex,0,absent,yes,no,no
1,59,ex,0,absent,yes,no,no
0,49,ex,0,absent,no,yes,no
0,56,ex,0,absent,no,yes,no
0,38,never,0,absent,no,no,no
1,66,current,12,absent,yes,no,no
0,49,never,0,absent,no,no,no
1,48,current,12,absent,no,no,no
1,61,current,10,absent,yes,no,no
1,57,current,30,absent,yes,no,no
0,52,current,0,absent,no,no,no
0,43,never,0,absent,no,no,no
1,53,ex,0,mild,yes,no,no
1,58,current,30,mild,no,no,no
0,41,current,4,absent,no,no,no
1,53,current,24,absent,no,no,no
0,61,never,0,absent,no,no,no
1,64,current,20,moderate,no,no,no
1,48,current,40,moderate,no,no,no
1,59,current,26,absent,no,no,no
0,41,never,0,absent,no,no,no
0,53,ex,0,absent,no,no,no
1,69,current,25,absent,no,no,no
1,61,current,10,moderate,yes,no,yes
1,57,never,0,absent,yes,no,no
1,57,ex,0,absent,yes,no,no
0,45,never,0,absent,yes,no,no
0,69,never,0,absent,no,no,yes
0,51,never,0,mild,no,no,no
0,55,never,0,mild,no,yes,no
0,47,current,5,mild,no,no,no
0,39,never,0,absent,yes,no,no
1,47,ex,0,mild,no,yes,no
1,62,current,20,mild,yes,no,no
1,54,ex,0,absent,yes,no,no
1,48,never,0,moderate,yes,no,yes
0,37,current,10,absent,yes,no,no
0,74,never,0,absent,no,no,no
1,54,current,25,absent,yes,no,no
0,63,never,0,absent,no,no,no
1,63,current,30,absent,no,no,no
0,58,never,0,absent,no,no,no
0,58,never,0,absent,no,no,no
0,69,never,0,absent,no,no,no
1,38,current,15,absent,yes,no,no
1,58,ex,0,absent,yes,no,no
1,38,current,15,absent,yes,yes,no
1,46,ex,0,absent,no,no,no
1,59,current,40,absent,yes,no,no
1,60,ex,0,absent,yes,no,yes
0,49,never,0,absent,no,no,yes
1,31,current,40,mild,no,no,no
1,53,current,5,absent,no,no,no
1,68,ex,0,absent,yes,no,no
0,47,never,0,absent,no,no,no
0,62,never,0,absent,no,no,no
0,45,current,30,absent,yes,no,no
1,48,never,0,mild,no,no,no
0,39,ex,0,absent,yes,no,no
1,55,current,10,absent,yes,no,no
0,42,never,0,absent,yes,no,no
1,68,current,5,absent,no,no,no
0,51,never,0,absent,no,no,no
1,59,ex,0,absent,no,no,yes
0,38,current,20,absent,no,no,no
1,64,current,30,moderate,no,no,no
0,43,current,20,absent,yes,no,yes
1,59,ex,0,absent,no,no,no
1,61,current,10,mild,no,no,no
0,41,never,0,absent,yes,no,no
0,41,never,0,absent,no,no,no
1,56,never,0,absent,yes,no,no
1,56,ex,0,absent,yes,no,no
1,68,never,0,absent,no,no,no
0,65,never,0,absent,no,no,no
1,51,ex,0,absent,yes,no,no
1,48,current,40,absent,yes,no,no
1,48,current,10,absent,yes,no,no
0,47,never,0,absent,no,no,no
0,46,ex,0,absent,no,no,no
0,52,never,0,absent,no,no,no
0,54,current,30,absent,no,no,no
1,60,never,0,mild,yes,no,yes
0,44,ex,0,absent,no,no,no
0,51,never,0,absent,no,no,no
1,58,current,40,absent,yes,no,no
1,70,ex,0,absent,yes,no,no
1,67,never,0,moderate,yes,yes,no
1,64,ex,0,mild,yes,no,no
0,29,never,0,absent,no,no,no
1,48,never,0,moderate,no,no,no
0,39,never,0,absent,no,no,no
0,51,never,0,absent,no,yes,no
1,38,never,0,absent,no,no,no
0,49,never,0,absent,no,no,no
1,73,current,10,absent,no,no,no
0,51,never,0,absent,no,no,no
1,64,ex,0,absent,yes,no,yes
1,49,current,20,absent,no,no,no
1,55,current,20,mild,yes,no,no
0,41,never,0,absent,no,no,no
0,41,never,0,mild,no,no,no
1,56,current,34,moderate,no,no,no
0,44,never,0,absent,no,no,no
0,50,ex,0,absent,no,no,no
0,53,ex,0,mild,no,no,no
0,41,never,0,absent,no,no,no
1,48,ex,0,mild,no,no,no
1,39,current,10,absent,no,no,no
1,49,ex,0,absent,yes,yes,no
0,51,never,0,absent,no,no,no
0,44,never,0,absent,no,no,no
1,59,ex,0,absent,yes,no,no
0,48,never,0,absent,no,no,no
1,66,ex,0,absent,yes,no,no
1,57,ex,0,absent,yes,no,no
1,59,current,24,absent,no,no,no
0,53,never,0,absent,no,no,no
1,64,current,20,absent,no,no,no
0,49,current,20,absent,no,no,no
0,51,current,10,mild,no,no,no
1,63,ex,0,absent,no,no,no
0,56,never,0,absent,no,no,no
1,40,current,15,absent,yes,no,no
1,56,never,0,mild,no,no,no
0,43,never,0,absent,no,no,no
0,37,current,20,absent,no,no,no
1,39,current,20,mild,no,no,no
1,56,current,20,absent,yes,no,no
1,53,current,26,absent,no,no,no
1,55,current,10,absent,yes,no,no
1,59,current,30,absent,no,no,no
0,41,never,0,absent,no,no,no
0,52,ex,0,absent,no,no,no
0,46,current,10,absent,yes,no,no
1,43,current,20,mild,yes,no,no
0,53,ex,0,absent,no,no,no
1,61,current,20,mild,yes,no,no
0,55,never,0,mild,no,no,no
0,41,never,0,absent,no,no,no
1,29,current,20,absent,no,no,no
1,33,ex,0,mild,yes,no,no
0,45,ex,0,moderate,no,no,no
0,40,never,0,absent,no,no,no
0,56,never,0,absent,yes,no,no
0,46,current,10,mild,no,no,no
1,53,ex,0,absent,yes,yes,no
1,57,current,20,absent,no,yes,no
1,56,current,25,absent,no,no,no
0,46,ex,0,absent,yes,no,no
1,46,current,20,absent,yes,no,no
1,63,current,20,mild,no,no,no
1,64,current,32,absent,no,no,no
0,49,current,20,absent,no,no,no
0,55,never,0,absent,yes,no,no
0,45,current,20,absent,no,no,no
0,58,never,0,mild,no,no,no
1,64,ex,0,absent,yes,no,no
1,64,current,15,mild,yes,no,no
0,58,never,0,moderate,no,no,no
1,66,current,40,absent,yes,no,no
0,45,never,0,absent,no,no,no
1,67,current,25,mild,no,no,no
0,41,never,0,absent,no,no,no
0,49,never,0,absent,no,no,no
0,37,never,0,absent,no,no,no
0,53,ex,0,absent,no,yes,no
1,58,current,30,absent,yes,no,no
0,49,never,0,absent,no,no,no
0,48,current,13,absent,no,no,no
1,59,current,30,absent,no,no,no
0,49,current,3,mild,no,no,no
1,48,current,15,mild,no,no,no
1,68,never,0,moderate,no,no,no
0,44,never,0,mild,no,no,no
0,42,current,30,absent,no,yes,no
0,65,never,0,mild,no,no,no
0,59,never,0,mild,no,no,no
0,49,current,6,absent,no,no,no
1,51,current,30,mild,no,no,no
0,50,never,0,absent,no,no,no
1,39,current,15,absent,no,no,no
0,39,ex,0,absent,no,no,no
1,59,never,0,moderate,no,no,no
1,51,current,25,absent,yes,no,no
1,70,current,15,absent,yes,no,no
0,54,never,0,mild,no,no,no
0,58,ex,0,moderate,no,no,no
0,57,never,0,absent,no,no,no
0,65,current,10,absent,no,no,no
1,73,current,15,mild,yes,no,no
0,53,never,0,absent,no,no,no
---
title: "STA-138-R-Handout-Week-9-Model-Selection-And-Diagnostics"
date: "November 10, 2015"
output: html_document
---

# Model Selection and Diagnostics in Logistic Regression
This handout will go over traditional model selection techniques, diagnostics, and
more advanced modeling techniques in logistic regression.
Note, this handout requires the installation of the following packages (using these commands):
`install.packages("bestglm")`
`install.packages("LogisticDx")`
`install.packages("caret")`
### The data
We will use data on German credit ratings, in the package `caret`. First I will reduce the number of variables (from 62) to something a bit more manageable (for now!):
```{r}
library(caret)
data(GermanCredit)
small.credit = GermanCredit[,c("Duration","Amount","Age","ResidenceDuration",
                               "ForeignWorker","Housing.Rent")]
small.credit$y = ifelse(GermanCredit$Class == "Good", 1, 0)
full.model = glm(y ~ ., data = small.credit, family = binomial(link = logit))
```
The variables I chose to look at are `Class` (the response, 1 = good credit, 0 = bad credit), the duration in months of the customer's loan, the credit score amount, the age of the customer, the duration the customer has lived in Germany (a score), an indicator of whether they are a foreign worker (1 = no, 0 = yes), and an indicator of whether they rent a house (1 = yes, 0 = no).
### 1. Calculating AIC, BIC for a particular model.
To retrieve the AIC and BIC of a particular model, you can use the functions `AIC`
and `BIC`:
```{r}
full.AIC = AIC(full.model)
full.BIC = BIC(full.model)
c(full.AIC, full.BIC)
```
However, these will not match the formulas given in class (they are missing a
multiple of a constant), thus, I have also built a function that will give you many
criteria and some more useful information:
```{r}
All.Criteria = function(the.model){
  p = length(the.model$coefficients)
  n = length(the.model$residuals)
  the.LL = logLik(the.model)
  the.BIC = -2*the.LL + log(n)*p
  the.AIC = -2*the.LL + 2*p
  the.results = c(the.LL, p, n, the.AIC, the.BIC)
  names(the.results) = c("LL","p","n","AIC","BIC")
  return(the.results)
}
```
When you give this function a model, it returns all the above information:
```{r}
All.Criteria(full.model)
```
You will have to copy and paste this function definition every time you want to use
it and have restarted R.
You can also list a subset of specific models, and find all the above criteria for
all the models. For example, here are all models involving just X1, X2, X3. First
I will rename the variables to make things a bit easier for myself:
```{r}
old.names = names(small.credit) # To save the old names
names(small.credit)
names(small.credit) = c("X1","X2","X3","X4","X5","X6","y")
```
Now, I will list all the model formulas I would use:
```{r}
All.Models = c("y~1","y~X1","y~X2","y~X3","y~X1+X2","y~X1+X3","y~X2+X3",
               "y~X1+X2+X3")
```
Then, I will use an `sapply` to apply the criteria function to all of the above
models.
```{r}
all.model.crit = t(sapply(All.Models, function(M){
  current.model = glm(M, data = small.credit, family = binomial)
  All.Criteria(current.model)
}))
RC = round(all.model.crit, 3)
RC
names(small.credit) = old.names
```
You would have to change the names of your dataset, and the name of your data
(`small.credit` above) to use this code. You could also modify the list of
possible models, as I have only written out all models involving X1, X2, X3.
### 2. PRE- Proportion of Reduction in Error
This is relatively easy to calculate in R:
```{r, message = " ", comment = " "}
prop.red = 1 - sum((full.model$y - full.model$fitted.values)^2) /
  sum((full.model$y - mean(full.model$y))^2)
prop.red
```
You would have to change `full.model` to whatever the name of your model is.
### 3. Model selection
### 3.1 Forward stepwise selection.
The first step is to also fit the empty model, and then we may use the `step` function in `R` …
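The handout's preview cuts off here, but the idea can be sketched with `step()`. The example below uses the built-in `mtcars` data rather than the credit data, since the handout's continuation is not included; the response, the scope formula, and the variable choices are illustrative assumptions only.

```{r}
# Forward stepwise selection by AIC: start from the intercept-only
# (empty) model, and let step() add terms up to the full model's scope.
empty.model = glm(am ~ 1, data = mtcars, family = binomial(link = logit))
full.scope  = glm(am ~ mpg + wt + qsec, data = mtcars,
                  family = binomial(link = logit))
forward.model = step(empty.model,
                     scope = list(lower = empty.model, upper = full.scope),
                     direction = "forward", trace = FALSE)
forward.model$formula
```

Backward elimination (`direction = "backward"`, starting from the full model) and both-direction stepwise (`direction = "both"`) use the same `step()` call with a different starting model.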
