BIOL 419/519 Homework 3, Winter 2019
Due on Monday, February 4 at 11:59pm
Instructions: Submit the Jupyter notebook of your work. Your notebook solutions will include the code you wrote to solve the problem as well as the output/answer. Each part of each problem should be in a separate cell (or multiple cells) with clear comments labeling them, so that their outputs are easily found by the grader!
Expectations: Please seek help if you need it! You may ask questions at Friday’s lab, come to office hours, and get together with your classmates to troubleshoot together.
Collaboration: As noted in the Syllabus, what you turn in should reflect your own understanding of the material. Collaboration with your classmates is encouraged, and I ask you to clearly indicate these collaborations as comments in your homework.
1. (5 pts) Random Walks
Random walks are wonderful models of many biological processes; to name just a few examples, random walks can model the trajectory of proteins in a cytosol, swimming E. coli, membrane voltage fluctuations in a neuron, decision-making, and population demography. A random walk describes a mathematical process where a path is formed by a sequence of steps, each of which is random. Here we are going to write some code to explore properties of a random walk process.
Let’s first consider a one-dimensional random walk, say of a bacterium in a narrow tube. A bacterium starts at position x = 0 at time k = 0. At each time step, the bacterium takes one step of length 1 either up or down (that is to say, x changes by +1 or −1 at each step with equal probability).
(a) (1 pt) Write a function randomwalk that simulates the path of one bacterium after K steps. The function should take K as an input parameter. The output should be an array with the location of your bacterium at each of the K steps. Plot the path of 10 different random bacteria taking K = 500 steps, putting x position on the vertical axis and the number of steps k on the horizontal axis. Each path should be in a different color.
Reminder: As always, label your plots!
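One possible shape for such a function is sketched below. The optional rng argument is an addition for reproducibility and is not part of the assignment statement. Plotting is left out; each returned array can be passed directly to a line-plot call.

```python
import numpy as np

def randomwalk(K, rng=None):
    """Return the position of one walker after each of K unit steps."""
    # rng is an optional, illustrative extra so that runs can be repeated.
    rng = np.random.default_rng() if rng is None else rng
    steps = rng.choice([-1, 1], size=K)   # +1 or -1 with equal probability
    return np.cumsum(steps)               # running sum = position after steps 1..K

path = randomwalk(500, rng=np.random.default_rng(0))
print(path[:10])
```

Calling randomwalk ten times and plotting each result against np.arange(1, K + 1) gives the ten paths.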
Figure 1: This is an excellent short textbook by Howard Berg on Random Walks in Biology.
(b) (1 pt) Using your function, simulate and plot the distribution of where 10000 bacteria end up (i.e., their x positions) after K = 500 steps. This plot should be a histogram with enough bins to give a sense of this distribution nicely. What are the mean and standard deviation of this distribution?
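As a rough sketch of the bookkeeping involved, the snippet below draws all steps for all walkers in one array instead of calling a per-walker function. That is a vectorized shortcut, not what the problem literally asks for. It then summarizes the final positions. Because the steps are independent with variance 1, the standard deviation of the endpoint should come out near sqrt(K).

```python
import numpy as np

rng = np.random.default_rng(1)
n_walkers, K = 10000, 500

# One row per bacterium, one column per step; int8 keeps memory use small.
steps = rng.choice(np.array([-1, 1], dtype=np.int8), size=(n_walkers, K))
endpoints = steps.sum(axis=1, dtype=np.int64)   # final x position of each walker

# Bin counts for a histogram; a plotting library can draw these directly.
counts, edges = np.histogram(endpoints, bins=50)

print("mean:", endpoints.mean())
print("std: ", endpoints.std(), "compared with sqrt(K) =", np.sqrt(K))
```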
(c) (1 pt) Plot the mean and standard deviation of 10000 bacteria’s positions after K = 100, 200, 500, 1000, 5000, and 10000 steps. Choose a type of plot suitable for conveying this information succinctly. What pattern do you notice in the results (i.e., how do the mean and standard deviation vary as a function of increasing K)?
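For the larger values of K, simulating every step of every walker gets expensive. The sketch below swaps in a shortcut: if a walker takes U steps up out of K, its final position is 2U − K, and U follows a binomial distribution. This samples endpoints only, so it cannot replace a path simulation where the full trajectory is needed. The names results and n_walkers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n_walkers = 10000
results = {}   # K -> (mean, std) of the final positions

for K in [100, 200, 500, 1000, 5000, 10000]:
    ups = rng.binomial(K, 0.5, size=n_walkers)   # number of +1 steps per walker
    endpoints = 2 * ups - K                      # ups*(+1) + (K - ups)*(-1)
    results[K] = (endpoints.mean(), endpoints.std())
    print(K, results[K], "sqrt(K) =", np.sqrt(K))
```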
(d) (2 pt) Write another function randomwalk_biased that simulates the path of one bacterium after K steps where its random walk is biased to go in one direction over the other. This function should take two input parameters: K is the number of steps, and p is a number between 0 and 1 that determines the probability of going up. In other words, the bacterium takes a step +1 with probability p and a step −1 with probability 1 − p.
This number p roughly models chemosensory or thermosensory behavior, where the bacterium is seeking a source of food or avoiding an aversive stimulus, which biases its probability of going towards or away from the stimulus.
Using your randomwalk_biased function, simulate and plot the distribution of where 10000 particles end up after K = 500 steps with p = 0.45, and again with p = 0.2.
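A minimal sketch of a biased walk follows. The function is written here as randomwalk_biased, with an underscore. The optional rng argument and the reduced walker count are illustrative choices to keep the example quick. Each step has mean 2p − 1 and variance 4p(1 − p), so the endpoint should have mean K(2p − 1) and standard deviation 2·sqrt(Kp(1 − p)).

```python
import numpy as np

def randomwalk_biased(K, p, rng=None):
    """Return positions after each of K steps; a step is +1 with probability p."""
    rng = np.random.default_rng() if rng is None else rng
    steps = np.where(rng.random(K) < p, 1, -1)   # uniform draw below p -> step up
    return np.cumsum(steps)

rng = np.random.default_rng(3)
K, n_walkers = 500, 2000   # fewer walkers than the assignment asks for, for speed
summary = {}
for p in (0.45, 0.2):
    endpoints = np.array([randomwalk_biased(K, p, rng)[-1] for _ in range(n_walkers)])
    summary[p] = (endpoints.mean(), endpoints.std())
    print(p, summary[p], "expected mean:", K * (2 * p - 1))
```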
(e) (Extra Credit, 1 pt) Repeat this exercise for a random walk in two (or more!) dimensions. A particle in 2D starts at position x = 0 and y = 0 and takes one step either to the northwest, northeast, southeast, or southwest with equal probability. What are the expected (mean) position and the standard deviation of the position of the particle after K steps, where K is large?
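One way to set up the 2D case, sketched under the assumption that each diagonal step changes both x and y by exactly 1: choosing the x sign and the y sign independently yields the four diagonal directions with equal probability. The function name randomwalk_2d and the rng argument are illustrative.

```python
import numpy as np

def randomwalk_2d(K, rng=None):
    """Return a (K, 2) array of (x, y) positions after each diagonal step."""
    rng = np.random.default_rng() if rng is None else rng
    # Independent +/-1 draws for x and y give NE, NW, SE, SW with probability 1/4 each.
    steps = rng.choice([-1, 1], size=(K, 2))
    return np.cumsum(steps, axis=0)

path_2d = randomwalk_2d(1000, rng=np.random.default_rng(4))
print(path_2d[-1])   # final (x, y) position
```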
2. (2 pts) Field Trip
Consider this entirely plausible scenario: You had just spent a few months studying the biodiversity in a remote region of the Bog of Eternal Stench, and your return to Seattle required you to employ three different types of transportation: a swamp boat, a regional bus, and a commercial airplane.
After getting home, you were filling out a reimbursement form, when you realized—to your horror—you didn’t make note of the distance you had traveled using each mode of transportation. Curiously, all you remember are the following three facts.
• The total distance you traveled was 10060 kilometers.
• Twice the distance on the swamp boat plus the distance on the bus was 100 kilometers.
• The airplane trip was 5000 kilometers longer than 100 times the bus trip.
(a) (1 pt) How many unknowns are in this problem? Write down a system of equations to solve for the unknowns. (It’s okay to do this part as a comment or markdown cell in your notebook.) Convert your system of equations to matrix/vector notation (Ax = b).
(b) (1 pt) Solve your system of equations for the unknowns using numpy.
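One way to encode the three remembered facts is sketched below, with the unknowns ordered as swamp boat, bus, airplane. That ordering and the names s, b, a, and rhs are choices made for this example. The third fact is rearranged so that all unknowns sit on the left-hand side.

```python
import numpy as np

# Unknowns, in order: swamp boat s, bus b, airplane a (all in kilometers).
A = np.array([
    [1.0,    1.0, 1.0],   # s + b + a = 10060
    [2.0,    1.0, 0.0],   # 2s + b = 100
    [0.0, -100.0, 1.0],   # a - 100b = 5000
])
rhs = np.array([10060.0, 100.0, 5000.0])

x = np.linalg.solve(A, rhs)
s, b, a = x
print("boat:", s, "bus:", b, "airplane:", a)

# Sanity check: substituting the solution back should reproduce the right-hand side.
print(np.allclose(A @ x, rhs))
```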
3. (3 pts) Leaves of Grass
In this problem, you are going to practice the entire process of pulling data from text files, manipulating the data, visualizing the data, and starting to analyze the data. I expect that you will have to do quite a bit of self-teaching along the way to solve this problem—this is an important skill for you to practice! If you don’t know how to do it, I suggest that you start by looking through the class notes, then go to the recommended outside resources, then search for keywords on the internet, then talk to your classmates. Don’t be too shy and stubborn to collaborate!
The class Canvas website contains some data files that you should download. They contain data about the shapes of hundreds of species of leaves; this data was originally downloaded from:
https://archive.ics.uci.edu/ml/datasets/One-hundred+plant+species+leaves+data+set
Read the documentation that comes with the data and figure out what the data consists of.
Pull in the data for shapes of the leaves as a Pandas DataFrame. Here is a bit of code to get you started.
import numpy as np
import pandas as pd
data = pd.read_csv('FILENAME_HERE')
print('Data has shape', data.shape)
lshapes = data.values
How many rows and columns does the data have? What information is in the first column?
Note: Be sure to pull in the right data file, there are multiple ones. Be careful with using the right delimiters for the particular datafile, as well as with the header lines (sometimes, the first rows of a file are column labels; sometimes, they are not).
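To illustrate the note about delimiters and header lines, here is a small sketch using a made-up, in-memory stand-in for a file. The contents are hypothetical and are not the real leaf data. Passing header=None tells pandas that the first row is data rather than column labels, and sep sets the delimiter explicitly.

```python
import io
import pandas as pd

# Hypothetical stand-in for a data file: comma-delimited, no header row,
# a text label in the first column followed by numeric features.
fake_file = io.StringIO(
    "Species A,0.1,0.2,0.3\n"
    "Species A,0.2,0.1,0.4\n"
    "Species B,0.5,0.6,0.7\n"
)

data = pd.read_csv(fake_file, sep=",", header=None)
print("Data has shape", data.shape)
print(data.head())
```

Without header=None, the first row would be consumed as column names and the data would come out one row short.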
(a) (1 pt) For each leaf sample (row of data), 64 numbers are given to describe its shape. Make a plot of these shape descriptors averaged over each plant species, where the horizontal axis shows the 64 features and the vertical axis shows the mean value of each feature for the species. The line for each species should be a different color.
Because there are 100 species, it’s okay if some colors are used more than once.
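The per-species averaging can be sketched with a pandas groupby. The example below uses synthetic data, 3 made-up species with 4 samples each. It also assumes a layout with the species label in column 0 and the 64 features in columns 1 to 64, which you should check against the real file.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)

# Synthetic stand-in: species label in column 0, features in columns 1..64.
species = np.repeat(["sp_a", "sp_b", "sp_c"], 4)
features = rng.random((12, 64))
data = pd.DataFrame(features, columns=range(1, 65))
data.insert(0, 0, species)

# One row per species, one column per feature.
species_means = data.groupby(0).mean()
print(species_means.shape)
```

Each row of species_means is then one line to draw against the feature index.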
(b) (1 pt) Make a histogram of the 32nd shape feature across all leaf samples.
(c) (1 pt) Make a scatter plot of the 16th shape feature on the horizontal axis and the 32nd shape feature on the vertical axis for all leaf samples. Each leaf should be a dot, and each species should have a different color.
Because there are 100 species, it’s okay if some colors are used more than once. Along the same veins, a legend of 100 species is uninformative, so you don’t need to make one for this plot.
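For the histogram and scatter plot, the main bookkeeping is picking the right columns and turning species names into something a plotting call can use as a color. The sketch below again uses synthetic data and assumes the species label is in column 0, so that the 16th and 32nd features sit in columns 16 and 32. pd.factorize maps each species name to an integer code.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(6)

# Synthetic stand-in with the assumed layout: label in column 0, features in 1..64.
species = np.repeat(["sp_a", "sp_b", "sp_c"], 4)
data = pd.DataFrame(rng.random((12, 64)), columns=range(1, 65))
data.insert(0, 0, species)

feature_16 = data[16].to_numpy()
feature_32 = data[32].to_numpy()

# Integer code per sample (0, 1, 2, ...) plus the list of unique species names.
codes, names = pd.factorize(data[0])

# Bin counts for the histogram of the 32nd feature.
counts, edges = np.histogram(feature_32, bins=5)
print(codes, list(names), counts)
```

With matplotlib, the codes array can be passed as the c argument of plt.scatter(feature_16, feature_32, c=codes) so that samples of the same species share a color.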
4. (Informational) How many hours did you spend on this homework? How many of those hours were spent working alone (as opposed to in a group)?