
# We Helped With This R Language Programming Homework: Have A Similar One?

| Category | Programming |
|---|---|
| Subject | R, R Studio |
| Difficulty | Undergraduate |
| Status | Solved |
| More Info | Applied Statistics Homework Help |


## Assignment Description

FE515A4

Due: April 9th, 2018 11:59 PM

### Question 1: Stationarity and normality test (20 Points)

XLF is an ETF constructed by SPDR using equities from the Financial Select Sector. For this question, you are required to download the following equities from Yahoo Finance:

- XLF
- BAC
- C
- JPM
- WFC
- GS

For each of the equities above, download five years of daily data and complete the following tasks:

1. Calculate the daily return (either type) and test it for stationarity and normality. In the end, you should construct a table with three columns: Equity name (character), Stationarity result (TRUE or FALSE), and Normality result (TRUE or FALSE).

2. Calculate the monthly return (either type) and repeat step 1. In the end, what do you observe when comparing the test results for daily returns with those for monthly returns?
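One possible way to set up these tests is sketched below, assuming the `quantmod` and `tseries` packages; the function name `test_returns` and the 5% significance level are choices made here, not part of the assignment.

```r
# Sketch: download 5 years of daily data and test returns, assuming
# the quantmod and tseries packages are installed.
library(quantmod)
library(tseries)

tickers <- c("XLF", "BAC", "C", "JPM", "WFC", "GS")

test_returns <- function(ticker) {
  # Download roughly 5 years of daily data from Yahoo Finance
  px  <- getSymbols(ticker, src = "yahoo",
                    from = Sys.Date() - 5 * 365, auto.assign = FALSE)
  ret <- na.omit(diff(log(Ad(px))))                    # daily log returns
  stat_p <- adf.test(as.numeric(ret))$p.value          # ADF: H0 = non-stationary
  norm_p <- jarque.bera.test(as.numeric(ret))$p.value  # JB: H0 = normal
  data.frame(Equity     = ticker,
             Stationary = stat_p < 0.05,  # TRUE if we reject non-stationarity
             Normal     = norm_p > 0.05)  # TRUE if we fail to reject normality
}

result <- do.call(rbind, lapply(tickers, test_returns))
print(result)
```

For step 2, the same `test_returns` logic can be reused after aggregating with `quantmod::monthlyReturn` instead of the daily `diff(log(...))`.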

### Question 2: Linear regression (40 Points)

For this question, you need to use the data from question 1 and construct a portfolio.

1. Build a linear model using XLF as the response variable and the remaining equities as explanatory variables. Denote this model as **Model 1**.

2. Obtain the summary report of the linear model and interpret all coefficients (*α* and *β*). Are they significant?

3. Calculate the correlation between XLF and each of the remaining equities. Select the three equities with the highest correlation and construct another linear model. Denote this model as **Model 2**.

4. Compare Model 1 and Model 2: which model performs better?
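The four steps above can be sketched as follows, assuming `rets` is a data frame of daily returns with one column per ticker (built from the Question 1 data); the variable names and the use of adjusted R-squared as the comparison metric are assumptions.

```r
# Model 1: XLF regressed on all five bank equities
model1 <- lm(XLF ~ BAC + C + JPM + WFC + GS, data = rets)
summary(model1)   # alpha = (Intercept), betas = slope coefficients

# Correlation of XLF with each explanatory equity
cors <- cor(rets)["XLF", c("BAC", "C", "JPM", "WFC", "GS")]
top3 <- names(sort(cors, decreasing = TRUE))[1:3]

# Model 2: XLF regressed on the three most correlated equities
model2 <- lm(reformulate(top3, response = "XLF"), data = rets)

# One way to compare performance: adjusted R-squared
c(model1 = summary(model1)$adj.r.squared,
  model2 = summary(model2)$adj.r.squared)
```

Other comparison criteria (AIC via `AIC()`, residual standard error) would also be reasonable here.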

### Question 3: Gradient descent (40 Points)

Use gradient descent to find local minima of the function:

*f*(*x*) = (*x*² + 3*x* − 5) · cos(*x*)    (1)

To solve this question, use the following parameters:

- Set *α* = 0.001 as the step size; you can decrease this number if necessary.
- Define the tolerance (stopping-criterion) value by yourself.

For this question, you need to complete the following tasks:

1. Calculate the derivative *df* of *f(x)*.

2. Make a plot of *f(x)*, setting the value range of x to [0, 100].

3. Try at least 10 different starting points in [0, 100]. In the end, you should obtain 20 local minimum values.

4. Add these local minimum values to the plot from step 2. Additionally, put a proper x label, y label, title, and legend on your plot.

5. As you may have noticed, the alpha value (step size) is critical for gradient descent. Decreasing alpha may increase the computation time; increasing it may shorten the computation but fail to return the right estimate. Try three different alpha values and compare the computation time and accuracy. Which alpha value performs best? Write down your comments.
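A minimal sketch of the gradient-descent loop in base R is given below; the tolerance `tol`, iteration cap `max_iter`, and the particular starting points are assumptions, and the derivative follows from the product rule applied to equation (1).

```r
# f(x) = (x^2 + 3x - 5) * cos(x) and its derivative df(x)
f  <- function(x) (x^2 + 3*x - 5) * cos(x)
df <- function(x) (2*x + 3) * cos(x) - (x^2 + 3*x - 5) * sin(x)

grad_descent <- function(x0, alpha = 0.001, tol = 1e-6, max_iter = 1e5) {
  x <- x0
  for (i in seq_len(max_iter)) {
    step <- alpha * df(x)
    x <- x - step
    if (abs(step) < tol) break   # stop when updates become negligible
  }
  x
}

# Note: |df(x)| grows roughly like x^2 on [0, 100], so a smaller alpha
# may be needed for large starting points, as the assignment allows.
starts <- seq(5, 95, length.out = 10)   # 10 starting points in [0, 100]
minima <- sapply(starts, grad_descent)

curve(f, from = 0, to = 100, n = 2000,
      xlab = "x", ylab = "f(x)", main = "Gradient descent on f(x)")
points(minima, f(minima), col = "red", pch = 19)
legend("topleft", legend = c("f(x)", "local minima"),
       col = c("black", "red"), lty = c(1, NA), pch = c(NA, 19))
```

For task 5, wrapping the `sapply` call in `system.time()` for each alpha gives a direct computation-time comparison.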

### Bonus question: What is the right time interval? (30 Points)

As we mentioned in class, in most cases we assume that the return of an equity is normally distributed. However, this may not hold for real data. In this question, we are going to investigate this further using HFT (high-frequency trading) data.

- (10 Points) Using **SPY.csv**, calculate the log return for each minute and split the data into 15-minute windows (9:30 ∼ 9:45, 9:45 ∼ 10:00, ...). Is each window's data normally distributed?

- (20 Points) Instead of using a fixed time interval, can you think of another method to divide the time intervals depending on the data? For example, with this data set you may find that the returns from 9:30 to 10:00 fit a normal distribution, but the returns from 9:30 to 10:01 do not. In that case, the first time interval is 9:30 to 10:00, and the second time interval starts at 10:00 and runs to another data-determined time.

Additional information: to do this task, you need a basic data set as a benchmark. For example, you can use the initial 5 points as the initial data set.
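One data-driven splitting rule that follows the hint above can be sketched like this, assuming SPY.csv has one-minute rows with a `Price` column (the column name, the 5-point seed, and the 5% level are assumptions):

```r
# Sketch: grow each interval minute by minute, starting from a 5-point
# seed, and close it as soon as adding a point breaks normality.
spy <- read.csv("SPY.csv")
ret <- diff(log(spy$Price))     # one-minute log returns

breaks <- integer(0)
start  <- 1
i      <- start + 5             # seed each interval with 5 extra points
while (i <= length(ret)) {
  window <- ret[start:i]
  # Shapiro-Wilk: H0 = the window's returns are normal
  if (shapiro.test(window)$p.value < 0.05) {
    breaks <- c(breaks, i - 1)  # close the interval before this point
    start  <- i
    i      <- start + 5
  } else {
    i <- i + 1
  }
}
breaks   # indices where one interval ends and the next begins
```

The fixed 15-minute part of the question can reuse the same `shapiro.test` check, applied to `split()` groups of 15 consecutive returns.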