Category | Math |
---|---|
Subject | Statistics |
Difficulty | Undergraduate |
Status | Solved |

## Assignment Description

University at Buffalo, Industrial and Systems Engineering

IE322 Analytics and Computing for Industrial Engineers

Homework #2, Fall 2017

Name:

Due 23:59 September 26, 2017

1. Use the following data to answer the questions:

1,2,2,6,6,6,7,8,9,10,10,10

a. Bin the data into three bins of equal width (width = 3).

b. Bin the data into three bins of four records each.

c. Explain why each of the binning solutions above is not optimal.

d. Propose a new binning method.
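
The two binning schemes in parts (a) and (b) can be sketched in Python as follows. This is a minimal illustration, not a model answer. The function names `equal_width_bins` and `equal_frequency_bins` are hypothetical, and the bin edges [1, 4), [4, 7), [7, 10] (starting at the minimum, with only the last bin closed on the right) are an assumption, since the assignment fixes only the width.

```python
data = [1, 2, 2, 6, 6, 6, 7, 8, 9, 10, 10, 10]


def equal_width_bins(values, k):
    """Split values into k bins of equal width spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k  # (10 - 1) / 3 = 3 for the data above
    bins = [[] for _ in range(k)]
    for v in values:
        # The maximum sits exactly on the last edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), k - 1)
        bins[idx].append(v)
    return bins


def equal_frequency_bins(values, k):
    """Split sorted values into k bins holding the same number of records.

    Assumes len(values) is divisible by k, as it is here (12 records, 3 bins).
    """
    ordered = sorted(values)
    size = len(ordered) // k
    return [ordered[i * size:(i + 1) * size] for i in range(k)]


print(equal_width_bins(data, 3))
print(equal_frequency_bins(data, 3))
```

Under these assumptions the equal-width bins hold 3, 3, and 6 records, while the equal-frequency bins place the three identical values of 6 in two different bins. Both observations are useful starting points for part (c).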

2. Define outliers and discuss two methods for identifying them.

3. Use the following stock price data (in dollars) for the questions:

14, 20, 12, 18, 4, 9, 15, 7, 10, 75, 12, 10, 16, 8

a. Find the min-max normalized stock price for the stock worth $20.

b. Compute the Z-score standardized stock price for the stock worth $20.

c. Find the decimal scaling stock price for the stock worth $20.

d. Calculate the skewness of the stock price data.

e. Is the distribution symmetric? If not, propose some methods to make it more symmetric, and verify your proposed methods.

f. Check whether there is any outlier, using the Z-score method.

g. Check whether there is any outlier, using the IQR method.

h. Investigate how the outlier affects the mean and median by doing the following:

i. Compare the mean and the median, with and without the outlier.
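
The calculations in Question 3 can be sketched with Python's standard `statistics` module. Several conventions below are assumptions rather than anything the assignment specifies: the sample standard deviation (`statistics.stdev`) is used for the Z-score, skewness is computed as 3(mean − median)/standard deviation (one of several definitions in use), the Z-score outlier threshold is |z| > 3, and quartiles follow the default "exclusive" method of `statistics.quantiles`. A different convention will change some of the numbers.

```python
import statistics

prices = [14, 20, 12, 18, 4, 9, 15, 7, 10, 75, 12, 10, 16, 8]
x = 20  # the stock price to transform

mean = statistics.mean(prices)
median = statistics.median(prices)
std = statistics.stdev(prices)  # assumption: sample (n - 1) standard deviation

# (a) Min-max normalization: rescale to [0, 1].
min_max = (x - min(prices)) / (max(prices) - min(prices))

# (b) Z-score standardization.
z_score = (x - mean) / std

# (c) Decimal scaling: divide by 10^d, where d is the digit count of the
# largest absolute value (75 has two digits, so d = 2).
d = len(str(max(abs(p) for p in prices)))
decimal_scaled = x / 10 ** d

# (d) Skewness, assuming the 3 * (mean - median) / std definition.
skewness = 3 * (mean - median) / std

# (f) Z-score method: flag values more than 3 standard deviations from the mean.
z_outliers = [p for p in prices if abs((p - mean) / std) > 3]

# (g) IQR method: flag values beyond 1.5 * IQR outside the quartiles.
q1, _, q3 = statistics.quantiles(prices, n=4)
iqr = q3 - q1
iqr_outliers = [p for p in prices if p < q1 - 1.5 * iqr or p > q3 + 1.5 * iqr]

# (h) Mean and median with and without the flagged values.
trimmed = [p for p in prices if p not in iqr_outliers]

print(min_max, z_score, decimal_scaled, skewness)
print(z_outliers, iqr_outliers)
print(mean, median, statistics.mean(trimmed), statistics.median(trimmed))
```

With these conventions both checks flag only the $75 price. Removing it leaves the median at 12 while the mean drops from about 16.43 to about 11.92, which illustrates how much more sensitive the mean is to a single extreme value.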

4. Discuss the advantages and disadvantages of replacing missing data with:

a. the mean or mode,

b. a random value,

c. constant values specified by the analyst.
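
The replacement strategies listed in Question 4 can be compared side by side on a toy example. Everything below is a hypothetical sketch: the `values` list is made-up data, `impute` is an illustrative helper, the random replacement draws from the observed values (one possible reading of "random value"), and the constant 0 stands in for whatever an analyst might choose.

```python
import random
import statistics

# Hypothetical data; None marks a missing record.
values = [12, None, 15, 12, None, 20]
observed = [v for v in values if v is not None]


def impute(values, fill):
    """Replace each missing entry with the result of calling fill()."""
    return [fill() if v is None else v for v in values]


rng = random.Random(0)  # seeded so the random fill is reproducible

by_mean = impute(values, lambda: statistics.mean(observed))
by_mode = impute(values, lambda: statistics.mode(observed))
by_random = impute(values, lambda: rng.choice(observed))  # assumption: sample from observed values
by_constant = impute(values, lambda: 0)  # assumption: analyst-chosen constant

print(by_mean)
print(by_mode)
print(by_random)
print(by_constant)
```

Comparing the spread of each filled list with the spread of `observed` is one way to see the trade-offs the question asks about.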

5. Figures (a), (b), (c), and (d) are normal probability plots (Q-Q plots) of a data set.

a. Compare figures (c) and (d) in terms of normality. Justify your answer.

b. Compare figures (a) and (b) in terms of skewness (left or right). Justify your answer.

[Figure: four normal probability plots, panels (a), (b), (c), and (d)]

6. Why does a data analyst need to eliminate skewness and transform data to achieve normality?

7. Below is a table of house prices. The data may contain some errors. List all the errors you can find.

Zip code | Latitude | Size (sq ft) | Number of bedrooms | Number of floors | Age of home | Price ($1000) |
---|---|---|---|---|---|---|
14227 | 33.14 | 2000 | 4 | 1 | 1 | 470 |
12314 | 32.64 | 1431 | 3 | 3 | 41 | 320 |
14270 | 3341 | 1029 | 0 | 80 | -1 | 120 |
1234 | 41.10 | 1h90 | 1 | 2 | 50 | 270 |
14r43 | 50 | 1011 | 1 | 2 | 100 | 432 |
14228 | 51.15 | 122 | 3 | 2 | 12 | 653 |
10862 | 65.14 | 2187 | 2 | 3 | 15 | 138 |
1121 | 40.00 | 1231 | 43 | 3 | 54 | 546 |

8. Create new flag variable(s) for the house price table above to
describe the price of each house as *Low, Median, High*. Show the
variable(s) you created and the criteria.
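
One way to encode a three-level category is with two 0/1 flag variables, leaving the third level as the case where both flags are 0. The sketch below illustrates this for the price column. The cut-offs `LOW_MAX` and `HIGH_MIN` and the names `price_low` and `price_high` are hypothetical choices made only for illustration; the assignment leaves the criteria to the student.

```python
# Price column from the house table, in $1000.
prices = [470, 320, 120, 270, 432, 653, 138, 546]

# Hypothetical criteria: below 200 is Low, 500 or above is High,
# anything in between is Median.
LOW_MAX = 200
HIGH_MIN = 500


def price_flags(price):
    """Return two 0/1 flags; both equal to 0 means the Median category."""
    return {
        "price_low": int(price < LOW_MAX),
        "price_high": int(price >= HIGH_MIN),
    }


flags = [price_flags(p) for p in prices]
for p, f in zip(prices, flags):
    print(p, f)
```

Two flags are enough for three categories because the third is implied when both flags are 0.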