# Logistic Regression in R: Exercises and Solutions

In order to solve the tasks you need:

- RStudio
- Data Files

## Exercise 1

A study recorded the type of anesthetic (A or B), nausea after the surgery (Yes or No), the amount of pain medication taken during the recovery period, and age for a random sample of 72 patients undergoing reconstructive knee surgery.

The data is in the file `anesthesia`.

### Part 1a

Use R to create a two-way table with the type of anesthetic defining the rows and nausea after the surgery as the columns and also produce the output for a chi-square test for independence.

Is there an association between these two categorical variables at a 5% level of significance?

### Answer 1a

```
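library(readxl)

# Read the data and cross-tabulate anesthetic (rows) by nausea (columns)
anesthesia <- read_excel("anesthesia.xlsx")
tbl <- table(anesthesia$anesthetic, anesthesia$nausea)
tbl

# Chi-square test for independence (Yates' correction is the default for 2x2 tables)
chisq.test(tbl)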
```

```
##
##     No Yes
##   A 13  26
##   B 23  10
```

```
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: tbl
## X-squared = 8.0559, df = 1, p-value = 0.004535
```

Since p-value = 0.004535 < 0.05, we reject the null hypothesis that the type of anesthetic and nausea after the surgery are independent. There is an association between these two categorical variables at a 5% level of significance.
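For readers without the Excel file at hand, the test can be reproduced from the printed counts alone. The sketch below rebuilds the two-way table by hand (the object names `counts` and `res` are illustrative, not part of the original solution) and also prints the expected counts, which should all be at least 5 for the chi-square approximation to be reasonable.

```r
# Rebuild the 2x2 table from the counts printed above
counts <- matrix(c(13, 26,
                   23, 10),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(anesthetic = c("A", "B"),
                                 nausea = c("No", "Yes")))

# Yates' continuity correction is applied by default for 2x2 tables
res <- chisq.test(counts)
res$statistic  # X-squared
res$p.value    # p-value of the test
res$expected   # expected counts under independence
```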

### Part 1b

Obtain the output from R (including the Wald tests for coefficients - so use “summary” function) for the logistic regression model with nausea as the dependent variable and the type of anesthetic as the predictor variable.

### Answer 1b

```
anesthesia$nausea <- factor(anesthesia$nausea)
anesthesia$anesthetic <- factor(anesthesia$anesthetic)
mod <- glm(nausea ~ anesthetic, family = binomial, data = anesthesia)
summary(mod)
```

```
##
## Call:
## glm(formula = nausea ~ anesthetic, family = binomial, data = anesthesia)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -1.4823  -0.8497   0.0254   0.9005   1.5453
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.6931     0.3397   2.041  0.04129 *
## anestheticB  -1.5261     0.5088  -2.999  0.00271 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 99.813 on 71 degrees of freedom
## Residual deviance: 90.133 on 70 degrees of freedom
## AIC: 94.133
##
## Number of Fisher Scoring iterations: 4
```

### Part 1c

What is the outcome of the hypothesis test that the coefficient of **anesthetic** is “zero” vs “not zero” at a 5% level of significance? (use the Wald test from the R output from the logistic regression you performed)

### Answer 1c

At a 5% level of significance we reject the null hypothesis that the coefficient of **anesthetic** is “zero”, because p-value = 0.00271 < 0.05.

### Part 1d

Convert the estimated coefficient of **anesthetic** to an odds ratio and interpret it in the context of the problem.

### Answer 1d

```
## anestheticB
## 0.2173913
```

The odds of having nausea after the surgery when anesthetic B is used are 0.217 times the odds of having nausea when anesthetic A is used.
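The output above is what exponentiating the fitted coefficient produces. A minimal sketch, assuming the data are rebuilt from the counts in the two-way table of Answer 1a rather than read from the Excel file (the helper name `counts` is illustrative):

```r
# Cell counts taken from the two-way table in Answer 1a
counts <- data.frame(
  anesthetic = c("A", "A", "B", "B"),
  nausea     = c("No", "Yes", "No", "Yes"),
  n          = c(13, 26, 23, 10)
)

# Expand to one row per patient (72 rows in total)
anesthesia <- counts[rep(seq_len(nrow(counts)), counts$n), c("anesthetic", "nausea")]
anesthesia$nausea <- factor(anesthesia$nausea)          # levels: No, Yes
anesthesia$anesthetic <- factor(anesthesia$anesthetic)  # levels: A (reference), B

mod <- glm(nausea ~ anesthetic, family = binomial, data = anesthesia)

# Odds ratio for nausea, anesthetic B relative to anesthetic A
or_B_vs_A <- exp(coef(mod)["anestheticB"])
or_B_vs_A
```

With a single binary predictor, this value equals the cross-product ratio of the table, (10/23) / (26/13).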

### Part 1e

Install the package “mosaic” (if you don’t have it installed already), then load it. Use the oddsRatio function to compute the odds ratio for having nausea for anesthetic A vs B. You may have to refer back to Week 8 for details on odds ratios and the oddsRatio function in R.

### Part 1f

When logistic regression coefficients are negative, the interpretation sometimes has more impact when we switch the perspective and use the reciprocal of the exponentiated coefficient. Find the odds ratio for having nausea for anesthetic A compared to anesthetic B (i.e. take the reciprocal of the odds ratio you computed in part **1d**).

Interpret this odds ratio in the context of the problem.

### Answer 1f

```
## anestheticB
## 4.6
```

For anesthetic A, the odds of having nausea are 4.6 times the odds of having nausea when anesthetic B is used.
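Because the model has a single binary predictor, this reciprocal odds ratio can be cross-checked directly from the counts in the two-way table of Answer 1a. The variable names below are illustrative.

```r
# Odds of nausea (Yes / No) within each anesthetic group
odds_A <- 26 / 13
odds_B <- 10 / 23

# Odds ratio for nausea, anesthetic A relative to anesthetic B
or_A_vs_B <- odds_A / odds_B
or_A_vs_B

# Taking the reciprocal recovers the B-vs-A odds ratio from part 1d
1 / or_A_vs_B
```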

### Part 1g

Compute the predicted probability of a reconstructive knee surgery patient having nausea after surgery when anesthetic A was used.

### Answer 1g

```
## 1
## 0.6666667
```
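Anesthetic A is the reference level, so its linear predictor is just the intercept, and the predicted probability is the inverse logit of that value. A quick hand check, using the rounded intercept from the summary output in Answer 1b (the variable names are illustrative):

```r
b0 <- 0.6931       # intercept from the model summary (rounded)
p_A <- plogis(b0)  # inverse logit: exp(b0) / (1 + exp(b0))
p_A

# Observed proportion with nausea in group A, for comparison
26 / (13 + 26)
```

The two values agree up to the rounding of the intercept, as expected for a model whose only predictor is the group indicator.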

### Part 1h

Compute a 95% confidence interval for the predicted probability of a reconstructive knee surgery patient having nausea after surgery when anesthetic A was used.

### Answer 1h

```
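# Patient profile to predict for: anesthetic A
newdata <- data.frame(anesthetic = "A")

# Predict on the link (log-odds) scale, where the Wald interval is symmetric
pred <- predict(mod, newdata, type = "link", se.fit = TRUE)
invlink <- mod[["family"]][["linkinv"]]  # inverse logit
z <- qnorm(1 - (1 - 0.95) / 2)           # 97.5th percentile of the standard normal
low.ci <- pred$fit - z * pred$se.fit
up.ci <- pred$fit + z * pred$se.fit

# Back-transform the estimate and interval limits to the probability scale
prediction <- invlink(pred$fit)
low.ci.response <- invlink(low.ci)
up.ci.response <- invlink(up.ci)
data.frame(prediction = prediction, low.ci = low.ci.response, up.ci = up.ci.response)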
```

```
## prediction low.ci up.ci
## 1 0.6666667 0.5068447 0.7955831
```

## Exercise 2

Continue using the `anesthesia` data set to do the following.

### Part 2a

Obtain the output from R (including the Wald tests for coefficients - so use “summary” function) for the logistic regression model with nausea as the dependent variable and the amount of pain medication taken as the predictor variable.

At \(\alpha = 0.05\), is there a statistically significant relationship between nausea and the amount of pain medication taken?

### Answer 2a

```
##
## Call:
## glm(formula = nausea ~ painmed, family = binomial, data = anesthesia)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -2.8555  -0.6167  -0.1072   0.8206   1.7894
##
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.062742   0.764501  -4.006 6.17e-05 ***
## painmed      0.037487   0.008833   4.244 2.20e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 99.813 on 71 degrees of freedom
## Residual deviance: 68.049 on 70 degrees of freedom
## AIC: 72.049
##
## Number of Fisher Scoring iterations: 5
```

There is a statistically significant relationship between nausea and the amount of pain medication taken, since the p-value for the coefficient of `painmed` is 2.20e-05 < 0.05.
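The Wald test reported in the summary can be reproduced by hand from the printed estimate and standard error. This is only a sketch of the arithmetic behind the `z value` and `Pr(>|z|)` columns; the variable names are illustrative.

```r
est <- 0.037487  # estimate for painmed, from the summary output
se  <- 0.008833  # its standard error

z <- est / se            # Wald z statistic
p <- 2 * pnorm(-abs(z))  # two-sided p-value from the standard normal
c(z = z, p.value = p)

p < 0.05                 # TRUE means significant at the 5% level
```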

### Part 2b

Convert the estimated coefficient of **painmed** to an odds ratio and interpret it in the context of the problem.

### Answer 2b

```
## painmed
## 1.038199
```

The odds of having nausea are multiplied by 1.038 (an increase of about 3.8%) for every 1-unit increase in the amount of pain medication taken during the recovery period.
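A per-unit odds ratio this close to 1 can look negligible, so it may help to rescale it to a larger increment. The sketch below uses the printed coefficient; the 10-unit increment is an arbitrary choice for illustration, not part of the exercise.

```r
b1 <- 0.037487         # coefficient of painmed from the summary output

or_1  <- exp(b1)       # odds ratio per 1-unit increase
or_10 <- exp(10 * b1)  # odds ratio per 10-unit increase (same as or_1^10)
c(per_1_unit = or_1, per_10_units = or_10)
```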

### Part 2c

Compute the predicted probabilities of a reconstructive knee surgery patient having nausea in the recovery time after surgery for when 50 units of pain medication are used and also for when 100 units of pain medication are used.

Comment on these two probabilities.

### Answer 2c

```
## 1 2
## 0.2335485 0.6650716
```

The probability of a patient having nausea when 50 units of pain medication are used is 0.2335, while the probability of a patient having nausea when 100 units of pain medication are used is 0.6651.

Equivalently, the odds of having nausea are 0.3047 when 50 units of pain medication are used and 1.986 when 100 units are used. The ratio of these odds is about 6.52, which equals \(1.038199^{50}\). This is consistent with the odds being multiplied by 1.038199 for every 1-unit increase in the amount of pain medication taken.
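These probabilities and odds can be checked by hand from the coefficients printed in Answer 2a. The sketch below assumes those rounded values, so the results agree with the output of `predict` only to about five decimal places; the variable names are illustrative.

```r
b0 <- -3.062742  # intercept from the summary output
b1 <-  0.037487  # coefficient of painmed

dose <- c(50, 100)
eta  <- b0 + b1 * dose  # linear predictor (log-odds)
prob <- plogis(eta)     # predicted probabilities of nausea
odds <- exp(eta)        # corresponding odds

data.frame(dose, prob, odds)

# Ratio of the odds at 100 vs 50 units; equals exp(50 * b1)
odds[2] / odds[1]
```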