
# We Helped With This R Programming Assignment: Have A Similar One?

| Category | Programming |
|---|---|
| Subject | R, R Studio |
| Difficulty | College |
| Status | Solved |
| More Info | Probability Homework Help |

## Assignment Description

Complete the following and submit it. Include all commands and output - you can paste these from the R console.

1. The data set ’aatemp’ in the ’faraway’ package contains observations on annual mean temperatures in Ann Arbor, Michigan, from 1854 to 2000.

(a) Produce a scatterplot of ’temp’ versus ’year’ with an overlaid regression line. Does there appear to be a linear relationship between the two variables? Does it appear a data transformation will be necessary to achieve a linear relationship?
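As a rough illustration of part (a), the sketch below draws a scatterplot and overlays the least squares line. The data are simulated stand-ins, not the real ’aatemp’ values, and the names `temps` and `fit` are hypothetical.

```r
# Simulated stand-in for the 'aatemp' data (hypothetical values, not the real data)
set.seed(1)
year <- 1854:2000
temp <- 45 + 0.012 * (year - 1854) + rnorm(length(year), sd = 1.4)
temps <- data.frame(year = year, temp = temp)

# Simple linear regression of temp on year
fit <- lm(temp ~ year, data = temps)

# Scatterplot with the fitted regression line overlaid
plot(temp ~ year, data = temps, xlab = "Year", ylab = "Annual mean temperature")
abline(fit, col = "red", lwd = 2)

# Intercept and slope of the overlaid line
coef(fit)
```

With the ’faraway’ package installed, `data(aatemp, package = "faraway")` should load the real data set, after which `aatemp` can replace `temps` in the calls above.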

(b) Use the function *boxcox()* in the library ’MASS’ to determine whether ’temp’ needs to be transformed. Recall that if the confidence interval for the parameter *λ* contains the value *λ* = 1, then a transformation is not necessary, while if the value *λ* = 1 is not contained in the interval, then a transformation may be necessary. What transformation do you decide on?
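The interval check described in part (b) can be sketched as follows. This is a minimal example on simulated data (an assumption, chosen so that the log scale is linear), and it reads the approximate 95% interval for λ off the profile log-likelihood that `boxcox()` returns, rather than off the plot.

```r
library(MASS)  # provides boxcox()

# Simulated data (hypothetical): the response is linear in x on the log scale
set.seed(2)
dat <- data.frame(x = runif(100, 1, 10))
dat$y <- exp(0.3 * dat$x + rnorm(100, sd = 0.3))

# Storing y in the fit avoids a refit inside boxcox()
fit <- lm(y ~ x, data = dat, y = TRUE)

# Profile log-likelihood over a grid of lambda values, without drawing the plot
bc <- boxcox(fit, lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Grid value that maximises the log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas whose log-likelihood lies within
# qchisq(0.95, 1) / 2 of the maximum
cutoff <- max(bc$y) - qchisq(0.95, df = 1) / 2
ci <- range(bc$x[bc$y >= cutoff])

lambda_hat
ci

# TRUE here would suggest that no transformation is needed
ci[1] <= 1 && 1 <= ci[2]
```

Calling `boxcox(fit)` with the default `plotit = TRUE` draws the same log-likelihood curve with the interval marked, which is the usual way to read it.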

2. The data set ’CarnivoreAbundanceMass.csv’ found on eCollege contains observations on the mass and the relative abundance of numerous carnivorous predators in an area. Biologists expect the abundance of a predator will depend upon the mass of the predator, since the larger the predator the more prey will be required to sustain it, and thus the fewer of these predators an area can sustain.

(a) Produce a scatterplot of ’Abundance’ versus ’Mass’ with an overlaid regression line. Does there appear to be a linear relationship between the two variables? Does it appear a data transformation will be necessary to achieve a linear relationship?

(b) Use the function *boxcox()* in the library ’MASS’ to determine whether ’Abundance’ needs to be transformed. What is the value of *λ* for the transformation?

(c) Transform ’Abundance’ using a Box-Cox transform with this value of *λ*.
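For reference, the Box-Cox transform of a positive response is (y^λ − 1)/λ when λ ≠ 0 and log(y) when λ = 0. The helper below is a small sketch of that formula; the function name `box_cox` and the tolerance `tol` are assumptions made for illustration.

```r
# Box-Cox transform of a positive vector y for a given lambda
# (hypothetical helper; lambda within tol of zero is treated as the log case)
box_cox <- function(y, lambda, tol = 1e-8) {
  stopifnot(all(y > 0))
  if (abs(lambda) < tol) {
    log(y)
  } else {
    (y^lambda - 1) / lambda
  }
}

# Square-root-type transform (lambda = 0.5)
box_cox(c(1, 4, 9), 0.5)

# Log transform (lambda = 0)
box_cox(c(1, exp(1)), 0)
```

Because the transform with λ ≠ 0 is a linear function of y^λ, regressing on y^λ directly gives the same fit quality, only with rescaled coefficients.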

(d) Now produce a scatterplot of the transformed ’Abundance’ versus ’Mass’ with an overlaid regression line. Does there appear to be a linear relationship between the two variables? Keep in mind that only the dependent variable has been transformed and it may be necessary to transform both.

(e) Use the *log()* function to log-transform both ’Abundance’ and ’Mass’. Now we are transforming both variables. Produce a scatterplot of the log-transformed ’Abundance’ versus the log-transformed ’Mass’. Does it now appear there is a linear relationship between the variables?

(f) We’ll cheat a bit and transform the predictor. Use the *log()* function to log-transform ’Mass’. Now determine a Box-Cox transformation for ’Abundance’. Does it get it right?

(g) Use ’broken stick regression’ to regress ’Abundance’ versus the log-transformed ’Mass’, with two linear segments and one ’knot’. Produce a scatterplot of ’Abundance’ versus the log-transformed ’Mass’ to determine a suitable knot. Then add the fitted line segments to the plot. How does it look?
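One common way to set up the broken stick regression in part (g) is with two hinge basis functions that meet at the knot, so the fitted segments join continuously. The sketch below uses simulated data and a knot fixed at 2; the data, the knot value, and the helper names `lhs` and `rhs` are all assumptions for illustration.

```r
# Simulated stand-in data (hypothetical): a slope change at log-mass = 2
set.seed(3)
knot <- 2  # hypothetical knot; in practice chosen by eye from the scatterplot
log_mass <- sort(runif(80, -2, 6))
abundance <- 10 - 3 * pmin(log_mass, knot) -
  0.5 * pmax(log_mass - knot, 0) + rnorm(80, sd = 0.5)

# Hinge functions: lhs is nonzero only left of the knot, rhs only right of it
lhs <- function(x, k) ifelse(x < k, k - x, 0)
rhs <- function(x, k) ifelse(x < k, 0, x - k)

# The intercept is the fitted value at the knot; each hinge gets its own slope
fit <- lm(abundance ~ lhs(log_mass, knot) + rhs(log_mass, knot))
coef(fit)

# Scatterplot with the two fitted segments and the knot marked
plot(log_mass, abundance, xlab = "log(Mass)", ylab = "Abundance")
lines(log_mass, fitted(fit), col = "red", lwd = 2)  # log_mass is sorted
abline(v = knot, lty = 2)
```

Sorting the predictor before plotting matters only for `lines()`, which connects points in the order given.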

3. The file ’SimData.csv’ contains two variables ’x’ and ’y’ that I simulated. Determine an appropriate transformation for the response ’y’ to achieve a linear relationship between ’x’ and ’y’, then perform ordinary least squares regression and report the results. Comment on significance, coefficient values, and how well the model fits the data.

4. Consider the variables ’PK’ and ’DAPE’ in the Koermer data. You will use robust regression techniques for ’PK’ versus ’DAPE’ in the Koermer data.

(a) Produce a scatterplot - there are many ’99’ values in ’PK’ corresponding to missing values, i.e., days where there was not a convective wind episode. Omit these observations by copying ’PK’ and ’DAPE’ into ’PKnew’ and ’DAPEnew’ with these values omitted.
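The filtering step in part (a) needs the same index applied to both vectors so that the pairs stay aligned. A minimal sketch, using made-up values in place of the Koermer data:

```r
# Made-up stand-in values (hypothetical, not the Koermer data)
PK   <- c(12, 99, 25, 18, 99, 40, 31)
DAPE <- c(3.1, 2.0, 5.6, 4.2, 1.5, 7.9, 6.3)

# Logical index of the days with a real 'PK' value (99 codes a missing value)
keep <- PK != 99

# Apply the same index to both vectors so the (DAPE, PK) pairs stay matched
PKnew   <- PK[keep]
DAPEnew <- DAPE[keep]

PKnew
DAPEnew
```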

(b) Produce a scatterplot of ’PKnew’ and ’DAPEnew’. Are there outliers?

(c) Perform OLS with *lm()* and robust regressions with *rlm()* letting *ψ* be ’psi.huber’, ’psi.hampel’, and ’psi.bisquare’. Use the ’psi’ argument to *rlm()*. Compare the results.

5. Simulate 50 observations of standard Normal noise and read these into vectors ’x’ and ’y’:

> x <- rnorm(50,0,1)

> y <- rnorm(50,0,1)

(a) Produce a scatterplot and calculate the correlation between ’x’ and ’y’. What do you see? Does linear regression seem appropriate?

(b) Now add the following outlier:

> x[51] <- 10

> y[51] <- 10

Produce a scatterplot and calculate the correlation between ’x’ and ’y’. What do you see? Does linear regression seem appropriate?

(c) Perform ordinary least squares regression using *lm()* and report the results.

(d) Perform robust regressions with *rlm()* letting *ψ* be ’psi.huber’, ’psi.hampel’, and ’psi.bisquare’. Use the ’psi’ argument to *rlm()*. Compare the results.

(e) Examine the weights from the robust regressions - how was the outlier treated?
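Parts (c) to (e) can be sketched together: fit OLS and the three M-estimators on the simulated data, put the coefficients side by side, and read the final weight that each robust fit assigns to observation 51. The seed and the object names `ols`, `fits`, and `coefs` are assumptions; the actual numbers depend on the random draw, so they are best inspected rather than assumed.

```r
library(MASS)  # provides rlm() and the psi functions

# Simulated data as in the assignment, plus the added outlier
set.seed(4)  # hypothetical seed, only for reproducibility
x <- rnorm(50, 0, 1)
y <- rnorm(50, 0, 1)
x[51] <- 10
y[51] <- 10

# Ordinary least squares fit
ols <- lm(y ~ x)

# Robust M-estimation fits, one per psi function
fits <- list(
  huber    = rlm(y ~ x, psi = psi.huber),
  hampel   = rlm(y ~ x, psi = psi.hampel),
  bisquare = rlm(y ~ x, psi = psi.bisquare)
)

# Intercepts and slopes side by side, one row per method
coefs <- rbind(ols = coef(ols), t(sapply(fits, coef)))
coefs

# Final robustness weight given to the added point; values near 0 mean
# the point was largely ignored, values near 1 mean it was kept
sapply(fits, function(f) f$w[51])
```

Note that the added point is extreme in ’x’ as well as in ’y’, so it has high leverage; M-estimators downweight large residuals, not leverage, which is worth keeping in mind when reading the weights.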

(f) Now use least trimmed squares regression with *ltsreg()*,
including inference. How do the results compare with OLS and robust regression?
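*ltsreg()* returns coefficient estimates but no standard errors, so inference has to come from somewhere else. One possible approach, shown here only as a sketch, is a case-resampling bootstrap of the slope; the seed, the number of resamples `B`, and the object names are assumptions, and other bootstrap schemes (for example resampling residuals) are equally defensible.

```r
library(MASS)  # provides ltsreg()

# Simulated data as in the assignment, plus the added outlier
set.seed(5)  # hypothetical seed, only for reproducibility
dat <- data.frame(x = c(rnorm(50, 0, 1), 10),
                  y = c(rnorm(50, 0, 1), 10))

# Least trimmed squares fit and, for comparison, the OLS fit
lts <- ltsreg(y ~ x, data = dat)
coef(lts)
coef(lm(y ~ x, data = dat))

# Case-resampling bootstrap of the LTS slope (B is a hypothetical choice)
B <- 200
boot_slopes <- numeric(B)
for (b in seq_len(B)) {
  idx <- sample(nrow(dat), replace = TRUE)
  boot_slopes[b] <- coef(ltsreg(y ~ x, data = dat[idx, ]))[["x"]]
}

# Percentile interval for the slope; an interval covering 0 would be
# consistent with no linear relationship
quantile(boot_slopes, c(0.025, 0.975))
```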

6. The file ’h6p6.csv’ on eCollege contains observations on two variables ’x’ and ’y’. Construct an appropriate polynomial regression model for ’y’ as a function of ’x’. Your model will be appropriate when the assumptions on the residuals are satisfied.
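A typical workflow for choosing the polynomial degree is to fit nested models of increasing degree, compare them with sequential F tests, and then check the residuals of the chosen model. The sketch below uses simulated cubic data in place of ’h6p6.csv’; the data and the choice of degree 3 are assumptions for illustration.

```r
# Simulated stand-in data (hypothetical): a cubic trend with Normal noise
set.seed(6)
x <- runif(120, -3, 3)
y <- 1 + 2 * x - 1.5 * x^2 + 0.5 * x^3 + rnorm(120, sd = 1)

# Nested polynomial fits of degree 1 to 5, using orthogonal polynomials
fits <- lapply(1:5, function(d) lm(y ~ poly(x, d)))

# Sequential F tests: each row tests whether the extra degree improves the fit
anova(fits[[1]], fits[[2]], fits[[3]], fits[[4]], fits[[5]])

# Candidate model (degree 3 is assumed here because the data were simulated so)
cubic <- fits[[3]]
summary(cubic)$r.squared

# Residual checks: normality test, and residuals against fitted values
shapiro.test(residuals(cubic))
plot(fitted(cubic), residuals(cubic),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

A residual plot with no visible curvature or funnel shape, together with an unremarkable normality check, is the usual sign that the degree is adequate.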
