Short Assignment Requirements
Data Analysis Script 2
This final assignment will provide you with the opportunity to demonstrate the full range of skills you have developed during the course. This Task will comprise mainly summative elements.
You will work individually to provide an R script of no more than 3000 words that details your working and outlines (as script annotations) how you would go about solving a real-world ecological problem. Submission should comprise a single MS Word file. Construct this file by copying the contents of your R script into a blank Word file, and then saving it with the appropriate name.
Submit your Word file containing your R script via SafeAssign on the Assessment tab of the Course’s BlackBoard page before 23:59 on Friday 28 October (Week 13).
From the list of problems provided, select one, and write an annotated R script to:
1. Import the data into R;
2. Manipulate the data into a form that will allow exploratory analysis and model building;
3. Undertake basic exploratory analysis of the data, including plots, where appropriate;
4. Construct, fit and assess statistical models, including, where necessary, appropriate hypothesis tests;
5. Summarise outcomes of the model-fitting process;
6. Interpret the results clearly and succinctly;
7. Output MS Office-compatible Tables and Figures, as appropriate, to aid in the interpretation of your results;
8. Store outputs, as needed; and
9. Using annotations in your script, explain the rationale of each step and summarise associated results; interpret main patterns from plots or findings from hypothesis tests; and draw conclusions.
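As a rough illustration of steps 1 to 3, the sketch below imports a CSV file, converts the categorical columns to factors, and runs a few exploratory summaries and plots. The column names (`presence`, `habitat`, `region`, `culling`, `road_dist`) and the data themselves are assumptions made up for this example; the real names are given in the metadata supplied with the data file. The demo file is written to a temporary folder so that the example runs on its own.

```r
# Simulated stand-in data; all column names are hypothetical.
set.seed(1)
demo <- data.frame(
  presence  = rbinom(60, 1, 0.5),   # 1 = sett present, 0 = random site
  habitat   = sample(c("woodland", "grassland", "disturbed"), 60, replace = TRUE),
  region    = sample(c("north", "south"), 60, replace = TRUE),
  culling   = sample(c("yes", "no"), 60, replace = TRUE),
  road_dist = round(runif(60, 10, 2000))  # distance to nearest road (m)
)
csv_path <- file.path(tempdir(), "badgers_demo.csv")
write.csv(demo, csv_path, row.names = FALSE)

# Step 1: import the data (in the real script the path would point to the supplied file)
badgers <- read.csv(csv_path, stringsAsFactors = FALSE)

# Step 2: manipulate, here by converting categorical columns to factors
cat_cols <- c("habitat", "region", "culling")
badgers[cat_cols] <- lapply(badgers[cat_cols], factor)
str(badgers)  # confirm column types before modelling

# Step 3: basic exploratory analysis
summary(badgers)                          # ranges, level counts, missing values
with(badgers, table(habitat, presence))   # presence/absence counts per habitat
boxplot(road_dist ~ presence, data = badgers,
        xlab = "Sett present (0 = no, 1 = yes)",
        ylab = "Distance to nearest road (m)")
```

Keeping every manipulation step in the script, rather than editing the file by hand, means the whole analysis can be rerun on the original data.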
NOTE that no pre-processing is to be done in Excel. Instead, all steps in the process must be coded in R. I will assess the work by running your script on the original data, so if there are steps missing, it will cost you marks.
You will be assessed on:
· Clarity and completeness of scripting;
· Appropriateness of data manipulation and analysis;
· Quality of resulting outputs; and
· Depth, appropriateness and accuracy of rationale/explanation and interpretation of results.
You’re a wildlife manager currently working on badgers. You are investigating the factors predicting badger sett (burrow) presence in England and Wales. Specifically, you’re interested in which of the available habitats in the landscape is preferred by the badger. Previous research suggests that badgers might prefer either woodland or grassland habitat. On the other hand, you know that, unfortunately, a good number of badgers get killed on the roads every year. Sadly, badgers are also still culled by farmers, either to protect their produce or because of the misconception that badgers increase the risk of tuberculosis in their cattle.
Over the past eight years, you collected data each time an inhabited badger sett was detected, based on badger presence or signs of recent badger activity. You have also collected data on random sites to know the environmental properties of sites where badgers are not present. The English government has made available a map of the land use of your study area (habitat composition). You classify that into woodland, grassland or human disturbance (including crops and suburban infrastructure). The geographic coordinates of the badger setts allow you to assign them to a habitat type and region and to calculate distance to the nearest road. Finally, you source a historical database of culling licences granted to farmers by local governments, and you identify whether the site was being culled during the sampling interval.
The file Badgers.csv contains your results. The first sheet in the file contains metadata (a description of the variables), whilst the second contains the data.
Construct models to address the following questions:
a. Is the number of setts on culling areas relative to those on non-culling areas independent of region and habitat type? Fit one model and produce one figure. [20 %]
b. Did habitat type, culling practices or distance to road affect likelihood of badger sett presence, and did this hold equally across the regions sampled? Fit one model and produce at least two figures. [80 %]
Be sure to explain all significant results in terms of your research questions (i.e., interpret model coefficients both graphically and in words), and speculate about what this result might mean in terms of badger conservation.
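One possible way to frame the two questions, offered only as a sketch and not as the expected solution: question (a) can be treated as a test of independence in a three-way table of counts (a Poisson log-linear model), and question (b) as a logistic regression in which each predictor is allowed to interact with region. All data below are simulated, and every variable name (`n_sites`, `culling`, `region`, `habitat`, `road_dist`, `presence`) is an assumption made for illustration.

```r
# Simulated data throughout; variable names are hypothetical stand-ins.
set.seed(42)

## (a) Log-linear model on counts per culling x region x habitat cell
counts <- expand.grid(
  culling = c("no", "yes"),
  region  = c("north", "south", "west"),
  habitat = c("disturbed", "grassland", "woodland")
)
counts$n_sites <- rpois(nrow(counts), lambda = 20)

# Saturated model: every association is allowed
m_sat <- glm(n_sites ~ culling * region * habitat, family = poisson, data = counts)
# Reduced model: culling is independent of region and habitat
m_ind <- glm(n_sites ~ culling + region * habitat, family = poisson, data = counts)
# Likelihood-ratio test of the independence hypothesis
lrt <- anova(m_ind, m_sat, test = "Chisq")
print(lrt)

# One figure: mosaic plot of the three-way table
mosaicplot(xtabs(n_sites ~ region + culling + habitat, data = counts),
           main = "Simulated counts by region, culling and habitat",
           color = TRUE)

## (b) Logistic regression for presence/absence
n <- 400
sites <- data.frame(
  habitat   = factor(sample(c("disturbed", "grassland", "woodland"), n, replace = TRUE)),
  culling   = factor(sample(c("no", "yes"), n, replace = TRUE)),
  region    = factor(sample(c("north", "south"), n, replace = TRUE)),
  road_dist = runif(n, 10, 2000)
)
# Arbitrary effect sizes, chosen only so that the example has some signal
eta <- -1 + 1.2 * (sites$habitat == "woodland") -
  0.8 * (sites$culling == "yes") + 0.001 * sites$road_dist
sites$presence <- rbinom(n, 1, plogis(eta))

# Each predictor may act differently in each region
m_max <- glm(presence ~ (habitat + culling + road_dist) * region,
             family = binomial, data = sites)
summary(m_max)

# Predicted probability of presence against distance to road, per habitat
new_sites <- expand.grid(
  habitat   = levels(sites$habitat),
  culling   = "no",
  region    = "north",
  road_dist = seq(10, 2000, length.out = 50)
)
new_sites$p <- predict(m_max, newdata = new_sites, type = "response")

plot(new_sites$road_dist, new_sites$p, type = "n", ylim = c(0, 1),
     xlab = "Distance to nearest road (m)",
     ylab = "Predicted probability of sett presence")
habs <- levels(sites$habitat)
for (i in seq_along(habs)) {
  sub <- new_sites[new_sites$habitat == habs[i], ]
  lines(sub$road_dist, sub$p, lty = i)
}
legend("topleft", legend = habs, lty = seq_along(habs), title = "Habitat")
```

Plotting predictions on the probability scale (`type = "response"`) rather than the logit scale keeps the figure readable in terms of the response variable.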
TASK 3 RUBRIC AND INDICATORS OF ATTAINMENT
(aka what I have to do to get an HD)
Assessment Criterion / Performance
Indicators of attainment
For each question:
Ability to frame research questions, hypotheses and predictions
· One clear and appropriate research question is formulated and articulated
· A conceptual model is developed in relation to the research question that proposes explanations in advance of undertaking analyses
· Predictions are derived from the conceptual model that relate to the associated research question
· Predictions are converted into appropriate null and alternative hypotheses, and these are clearly and correctly articulated
[2 + 8]
Ability to appropriately construct, fit and assess statistical models
· The code for model fitting works without error
· Where appropriate, collinear predictors are identified and eliminated before the modeling process is initiated
· The first model specified and fit is the maximal model
· The model is appropriately simplified, if possible
· The minimum adequate model is identified, including tests of model assumptions, where appropriate
[7 + 28]
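The indicators above (screen for collinearity, fit the maximal model first, then simplify towards a minimum adequate model) might look roughly like this in R. This is a sketch on simulated data with hypothetical variable names. Backward selection by AIC with `step()` is used here as one common simplification route; stepwise deletion with likelihood-ratio tests is an equally reasonable alternative.

```r
# Simulated data; variable names are hypothetical stand-ins.
set.seed(11)
n <- 300
d <- data.frame(
  habitat   = factor(sample(c("disturbed", "grassland", "woodland"), n, replace = TRUE)),
  culling   = factor(sample(c("no", "yes"), n, replace = TRUE)),
  road_dist = runif(n, 10, 2000)
)
# Only habitat truly affects presence in this simulation
d$presence <- rbinom(n, 1, plogis(-0.5 + 1.0 * (d$habitat == "woodland")))

# Collinearity screen: are the two categorical predictors associated?
chisq.test(table(d$habitat, d$culling))
# ... and does the continuous predictor differ among habitat types?
summary(aov(road_dist ~ habitat, data = d))

# Maximal model: all main effects and interactions
m_max <- glm(presence ~ habitat * culling * road_dist,
             family = binomial, data = d)

# Backward simplification by AIC (trace = 0 suppresses the step-by-step log)
m_min <- step(m_max, direction = "backward", trace = 0)
formula(m_min)   # terms retained in the simplified model
summary(m_min)   # coefficients on the logit scale

# Term-by-term likelihood-ratio tests for the retained model
drop1(m_min, test = "Chisq")
```

Each deletion should still be justified in the script annotations, since an automated procedure does not explain why a term was dropped.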
Ability to properly motivate each decision (including identifying variables for analysis), to interpret and/or explain each result, and to answer the research question
· A reason/motivation/rationale is provided for each line/section of code to be run (why am I doing this?)
· A brief description is provided for what each line of code is intended to achieve (what does it do?)
· The output of each line of code (that produces output in the console or graphics pane) is correctly and completely interpreted (what do the outputs mean?)
· Statistical results of hypothesis tests are emphasised, interpreted, and at least p-values are included (and correctly interpreted) in a brief discussion of the outputs
· Inferences regarding the research question/conceptual model/hypotheses are drawn when warranted
· A clear and correct answer is provided to address the research question
[6 + 24]
Ability to produce appropriate figures and/or tables for inclusion in a manuscript
· Tables and figures output are logical, and necessary in addressing the research question
· Table columns have appropriate headers
· Figures have appropriately labeled axes and legends, and they are accurate and easy to interpret in terms of the response variable
[4 + 12]
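A minimal sketch of how a table and a figure could be written to files that MS Office can open: a coefficient table saved as CSV (opens in Excel or Word) and a figure saved as PNG. The built-in `mtcars` data set is used purely as a stand-in so that the example runs on its own. The file names and the temporary output folder are assumptions for illustration.

```r
# Stand-in model on built-in data; replace with the model from your own analysis.
fit <- glm(am ~ wt, family = binomial, data = mtcars)

# Coefficient table with clear column headers
coef_tab <- as.data.frame(coef(summary(fit)))
names(coef_tab) <- c("Estimate", "Std. error", "z value", "p value")
coef_tab <- cbind(Term = rownames(coef_tab), coef_tab)

out_dir  <- tempdir()  # hypothetical output folder
tab_file <- file.path(out_dir, "model_coefficients.csv")
write.csv(coef_tab, tab_file, row.names = FALSE)

# Figure as PNG; png() needs graphics support compiled into R, hence the check
fig_file <- file.path(out_dir, "figure1.png")
if (capabilities("png")) {
  png(fig_file, width = 1600, height = 1200, res = 200)
  plot(mtcars$wt, fitted(fit),
       xlab = "Weight (1000 lb)",
       ylab = "Fitted probability (am = 1)")
  dev.off()  # close the device so the file is written to disk
}
```

Writing outputs from the script, rather than copying them from the console or plot pane, keeps the tables and figures reproducible when the script is rerun.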
Ability to produce a concise, accurate script
· Data are imported and manipulated correctly/ appropriately
· Unnecessary lines of code are omitted
· The sequence of steps in the code is logical
· The code is error free
[2 + 8]