Assignment Description

LIFE733 Assignment 3 BioPython team working exercise

Deadline: 4pm 30/04/2018

 

Hand-in and assessment notes

Grade will come from the following 4 components:

-          Overall team score for quality of total analysis pipeline and team documentation (40%)

-          Individual score for code written by you as an individual (every Python class MUST have a comment at the top indicating your name) and documentation showing how you tested your parts (60%)

 

 

Specifications

-          Each team contains 3 or 4 members

-          A team of 3 MUST have:

o   One team leader

o   One BLAST developer

o   One Multiple sequence alignment (MSA) developer

-          A team of 4 MUST have:

o   One team leader

o   One BLAST developer

o   One Multiple sequence alignment developer

o   One phylogenetics and literature mining developer

 

You have to construct a bioinformatics pipeline using BioPython.

The pipeline will take as input (fewer than 10 items in each case):

-          A set of protein sequences in FASTA format

-          A set of gene names + species name to look up

-          A set of protein identifiers

 

The pipeline will perform the following steps:

-          If protein identifiers are provided as input, it will retrieve these records from the provided FASTA file (uniprot-apicomplexa.fasta), reporting back to the user any records that could not be located

-          If a pair of gene name and species name is provided as input, it will attempt to retrieve these pairs from the FASTA file, or report an error if they cannot be found

-          It will extract these records to a temporary file (or use the protein sequences if provided in FASTA format) and perform a BLAST search against uniprot-apicomplexa.fasta

-          For each input protein sequence, the BLAST step should output those proteins passing a user-selected threshold (e.g. e-value < 1e-10) to a new FASTA file, as well as producing some plots or graphics in suitably named png files

-          The FASTA files from the BLAST step are passed to the multiple sequence alignment (MSA) step. An MSA should be performed using each input FASTA file, producing output alignment files and plots to display the quality of the alignments produced.

-          (For teams of 4 people) the Phylogenetics developer should process trees (input as .dnd files) to display species names and gene names (communicating with other team members to ensure such data is available). Trees should be reformatted to label branches for particular species with a given colour, shown in the figure legend.

-          (For teams of 4 people) the Phylogenetics developer will use the gene names and species names to query PubMed, retrieving any articles whose abstracts contain these search terms.

-          For top marks, I would like to see the results of all steps assembled into one or several pdf files by the team leader’s code, with figure legends and text indicating what each step contains.
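The two lookup steps above can be sketched as follows. This is a minimal illustration, not the required solution. It assumes UniProt-style headers of the form `>db|ACCESSION|ENTRY_NAME description OS=Species OX=... GN=Gene ...`, and the helper names (`parse_header`, `parse_fasta`, `find_by_ids`, `find_by_gene`) and the example accessions are invented for illustration. It uses only the standard library so that it runs anywhere; in the real pipeline the hand-written parser would be replaced by BioPython's `SeqIO.parse`.

```python
import re

# Assumed header layout:
# >db|ACCESSION|ENTRY_NAME description OS=Species OX=123 GN=Gene PE=1 SV=1
_FIELD = re.compile(r"\b(OS|OX|GN|PE|SV)=")


def parse_header(header):
    """Extract accession, species (OS) and gene (GN) from a UniProt-style header."""
    parts = header.split("|")
    accession = parts[1] if len(parts) >= 3 else header.split()[0]
    # Splitting on a capturing group yields [text, key, value, key, value, ...]
    pieces = _FIELD.split(header)
    fields = {k: v.strip() for k, v in zip(pieces[1::2], pieces[2::2])}
    return {"accession": accession,
            "species": fields.get("OS", ""),
            "gene": fields.get("GN", "")}


def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def find_by_ids(records, wanted_ids):
    """Return (found, missing): matching records, and identifiers not located."""
    wanted = set(wanted_ids)
    found = [(h, s) for h, s in records if parse_header(h)["accession"] in wanted]
    located = {parse_header(h)["accession"] for h, _ in found}
    return found, sorted(wanted - located)


def find_by_gene(records, gene, species):
    """Return records whose gene and species match (case-insensitive), or raise."""
    hits = []
    for header, seq in records:
        info = parse_header(header)
        # startswith tolerates strain suffixes such as "(isolate 3D7)"
        if (info["gene"].lower() == gene.lower()
                and info["species"].lower().startswith(species.lower())):
            hits.append((header, seq))
    if not hits:
        raise LookupError(f"No record found for gene {gene!r} in species {species!r}")
    return hits


# Invented two-record example, not taken from the real data file.
EXAMPLE = """\
>tr|A0A001|A0A001_PLAFA Mitogen-activated protein kinase OS=Plasmodium falciparum OX=5833 GN=MAPK1 PE=3 SV=1
MKSL
VDEN
>tr|B0B002|B0B002_TOXGO Mitogen-activated protein kinase OS=Toxoplasma gondii OX=5811 GN=MAPK2 PE=3 SV=1
MAAR
""".splitlines()

records = list(parse_fasta(EXAMPLE))
found, missing = find_by_ids(records, ["A0A001", "ZZZ999"])
print("Could not locate:", missing)
```

Returning the missing identifiers separately, rather than raising on the first failure, lets the pipeline report every record that could not be located in one message.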

 

Hand-in details:

1.       Team leader uploads to VITAL a zip containing:

a.       All code from the entire team – each class is written by one individual only and flagged as such in a comment at the top

b.       Documentation showing testing of the whole pipeline

2.       Each other individual uploads a document showing the testing of their individual code

 

Testing and documentation

-          On VITAL I have put a mini-FASTA file called uniprot_apicomplexa_mapk.fasta. For BLAST searching and extraction of gene names etc, use this FASTA file for testing. It contains all “MAPK” genes from Apicomplexan pathogen proteomes.

-          I have also uploaded the full FASTA files for all Apicomplexa (~230MB zipped). Only use this file when you’re convinced your code is working on smaller examples.

-          As noted above, you will be expected to show how you have tested the routines you have developed using both correct (working) and incorrect (non-working) inputs. Where possible, you should show how you handle incorrect inputs, to give helpful error messages to the end user. Make sure you also document your code well.
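One way to document testing with both working and non-working inputs is a small table of cases run against an input checker. The sketch below is only an illustration: the function name `validate_fasta_text`, the set of accepted residue characters and the wording of the messages are assumptions, and the limit of fewer than 10 sequences is taken from the specification above.

```python
# Assumed protein alphabet: the 20 standard residues plus ambiguity codes,
# stop (*) and gap (-) characters.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYBJOUXZ*-")


def validate_fasta_text(text, limit=10):
    """Return the number of records, or raise ValueError with a helpful message."""
    count = 0
    has_sequence = True  # False between a header and its first sequence line
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if not has_sequence:
                raise ValueError(f"Line {number}: the previous header has no sequence.")
            count += 1
            has_sequence = False
        elif count == 0:
            raise ValueError(f"Line {number}: sequence found before any '>' header; "
                             "is this a FASTA file?")
        else:
            bad = sorted(set(line.upper()) - ALLOWED)
            if bad:
                raise ValueError(f"Line {number}: unexpected character(s) {bad}.")
            has_sequence = True
    if count == 0:
        raise ValueError("Input is empty: expected at least one FASTA record.")
    if not has_sequence:
        raise ValueError("The last header has no sequence.")
    if count >= limit:
        raise ValueError(f"Found {count} sequences; the pipeline accepts fewer than {limit}.")
    return count


# Each case: (description, input text, expected record count or expected error type)
CASES = [
    ("valid", ">p1\nMKV\n>p2\nMAA\n", 2),
    ("empty", "", ValueError),
    ("no header", "MKV\n", ValueError),
    ("bad residue", ">p1\nMK1V\n", ValueError),
    ("too many", "".join(f">p{i}\nMKV\n" for i in range(10)), ValueError),
]


def run_cases():
    """Run every case, print what happened, and return {description: passed}."""
    results = {}
    for name, text, expected in CASES:
        try:
            outcome = validate_fasta_text(text)
            print(f"{name}: accepted {outcome} record(s)")
        except ValueError as err:
            outcome = ValueError
            print(f"{name}: rejected -> {err}")
        results[name] = outcome == expected
    return results


results = run_cases()
```

The printed lines can be pasted into the testing document as evidence of how each incorrect input is reported to the user.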

 

 

Team responsibilities

1.       Team Leader

-          The team leader is responsible for developing any code needed to retrieve details from FASTA files and for calling each step of the pipeline, i.e. writing controlling code that takes the first input and produces the final output, calling each step in turn.

-          Specifically, you will need to process input from the user (at the command line) of the three types listed above. If the input is gene names or protein identifiers, you will need to extract the matching records from the FASTA file and write them to a new temporary FASTA file to pass to the BLAST step.

-          For top marks, you should pass a shared pdf object to functions in the other three parts, so that figures can be added to a multi-page pdf report that you control, following this example: http://matplotlib.org/examples/pylab_examples/multipage_pdf.html

-          If you cannot make this work, the fallback is to produce a text file for the user as the final output, telling the user where to find results in a variety of png and text files as appropriate

-          For top marks, you could add some extra plots to the report showing some extra statistics e.g. time taken to run each step, counts of hits at each step or similar at your discretion
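The controlling code described above could start from something like the sketch below. It is an assumption-laden outline, not the expected design: the option names (`--fasta`, `--ids`, `--gene`, `--database`, `--evalue`), the choice to make the three input types mutually exclusive, and the placeholder steps are all invented for illustration. The recorded timings are the kind of data that could feed an extra statistics plot.

```python
import argparse
import time


def build_parser():
    """Command-line interface accepting exactly one of the three input types."""
    parser = argparse.ArgumentParser(description="Sketch of a pipeline front end")
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--fasta", help="FASTA file of query protein sequences")
    source.add_argument("--ids", nargs="+", help="protein identifiers to look up")
    source.add_argument("--gene", nargs=2, action="append", metavar=("GENE", "SPECIES"),
                        help="gene name and (quoted) species name; repeat for several pairs")
    parser.add_argument("--database", default="uniprot_apicomplexa_mapk.fasta",
                        help="FASTA file to search")
    parser.add_argument("--evalue", type=float, default=1e-10,
                        help="keep BLAST hits with an e-value below this threshold")
    return parser


def parse_inputs(argv):
    """Parse the command line and enforce the fewer-than-10 limit on lookups."""
    parser = build_parser()
    args = parser.parse_args(argv)
    for name in ("ids", "gene"):
        values = getattr(args, name)
        if values and len(values) >= 10:
            # parser.error prints a usage message and exits with status 2
            parser.error(f"--{name} accepts fewer than 10 entries, got {len(values)}")
    return args


def run_steps(steps, data):
    """Call each (name, function) step in turn, feeding each output to the next step."""
    timings = []
    for name, func in steps:
        start = time.perf_counter()
        data = func(data)
        timings.append((name, time.perf_counter() - start))
    return data, timings


args = parse_inputs(["--ids", "A0A001", "B0B002", "--evalue", "1e-5"])

# Placeholder steps standing in for the real lookup, BLAST and MSA code.
steps = [
    ("lookup", lambda ids: [f"{i}.fasta" for i in ids]),
    ("blast", lambda files: [f.replace(".fasta", "_hits.fasta") for f in files]),
    ("msa", lambda files: [f.replace(".fasta", ".aln") for f in files]),
]
outputs, timings = run_steps(steps, args.ids)
for name, seconds in timings:
    print(f"{name}: {seconds:.6f} s")
```

Keeping the steps as a list of named functions means each developer's code can be swapped in without changing the controlling loop.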

 

2.       BLAST developer

-          The team leader will pass to your code a FASTA file containing n protein sequences, for you to perform a BLAST search via the command line (note: don’t try this with 1000s of proteins; you may want to write code to limit the number of sequences to, say, fewer than 10).

-          You should write code to process the results from each of the n searches, to extract proteins passing a user-entered threshold, e.g. e-value < 1e-10, and produce n FASTA files to pass on to the next step (MSA).

-          For top marks, you should also produce one or two plots for each set of BLAST results, written both as png files and to the pdf object provided by the team leader (if they manage this part), showing for example a histogram of BLAST scores, a nicely formatted alignment or another plot of your choice
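The threshold filtering described above might look like the following sketch. It assumes the search is run with BLAST+ `blastp` writing tabular output (`-outfmt 6`), whose default columns are listed in `COLUMNS`; the function names and the example result lines are invented for illustration, and the command is only built here, not executed.

```python
# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ("qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore")


def blastp_command(query, subject, out):
    """Build a blastp argument list suitable for subprocess.run.

    -subject searches a FASTA file directly; for the full data set a database
    built with makeblastdb and passed via -db would be the usual choice.
    """
    return ["blastp", "-query", query, "-subject", subject,
            "-outfmt", "6", "-out", out]


def hits_passing(lines, threshold=1e-10):
    """Map each query ID to the subject IDs whose e-value is below the threshold."""
    passing = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        row = dict(zip(COLUMNS, line.rstrip("\n").split("\t")))
        hits = passing.setdefault(row["qseqid"], [])
        # A subject can appear in several rows (one per HSP); keep it once.
        if float(row["evalue"]) < threshold and row["sseqid"] not in hits:
            hits.append(row["sseqid"])
    return passing


# Invented result lines for two queries, in the tabular layout above.
EXAMPLE = [
    "q1\tA0A001\t100.0\t350\t0\t0\t1\t350\t1\t350\t0.0\t720",
    "q1\tB0B002\t45.2\t340\t170\t6\t5\t340\t8\t345\t3e-85\t260",
    "q1\tC0C003\t28.0\t90\t60\t2\t10\t95\t20\t108\t0.002\t35.1",
    "q2\tB0B002\t100.0\t300\t0\t0\t1\t300\t1\t300\t0.0\t610",
]

selected = hits_passing(EXAMPLE, threshold=1e-10)
print(selected)
```

The subject IDs collected for each query can then be used to pull the matching records out of the searched FASTA file and write one output file per query for the MSA step.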

 

3.       MSA developer

-          You will receive from the BLAST step n FASTA files, on which you should perform n clustalw runs to produce multiple sequence alignments.

o   Note: I recommend using the following command to produce better trees for phylogenetics: “clustalw2 -INFILE=[inputfilename].fasta -ALIGN -TYPE=protein -CLUSTERING=UPGMA”

-          For each alignment, aim to produce a plot similar to the one below as png (and written to the pdf object provided by the team leader if possible), showing the percentage agreement in positions along the alignment length, with a figure legend.

 

-          For top marks, you also need to produce one other plot of some type showing some statistics or a nicely formatted view of the alignment, with a figure legend
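The values behind a percentage-agreement plot can be computed as in the sketch below. The definition used here is an assumption, since the handout does not fix one: for each alignment column, agreement is the percentage of sequences that share the most common residue, with gaps never counted as agreeing. The function names and the toy alignment are invented; the command is the one recommended above, built as an argument list without being run.

```python
from collections import Counter


def clustalw_command(infile):
    """Argument list for the clustalw2 call recommended in the handout."""
    return ["clustalw2", f"-INFILE={infile}", "-ALIGN",
            "-TYPE=protein", "-CLUSTERING=UPGMA"]


def percent_agreement(aligned):
    """Return, per column, the percentage of sequences sharing the commonest residue."""
    if not aligned:
        raise ValueError("The alignment contains no sequences.")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("Aligned sequences must all have the same length.")
    scores = []
    for column in zip(*aligned):
        residues = Counter(char for char in column if char != "-")
        top = residues.most_common(1)[0][1] if residues else 0
        scores.append(100.0 * top / len(aligned))
    return scores


# Toy alignment of four sequences, five columns wide.
ALIGNED = ["MKV-L",
           "MKI-L",
           "MRVAL",
           "MKV-L"]

scores = percent_agreement(ALIGNED)
print(scores)  # [100.0, 75.0, 75.0, 25.0, 100.0]
```

Plotting `scores` against column position gives the agreement profile along the alignment length; the aligned sequences themselves would come from reading the clustalw output file, for example with BioPython's `AlignIO`.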

 

4.       Phylogenetics developer

-          From the MSA step, you will receive n “.dnd” files. You should reformat these to display species and gene names on the branches, adding a suitable figure legend. You should work out how to colour branches according to the different species in the results.

-          You should save n figures as png files, and ideally to the pdf object provided by the team leader’s code

-          You should query PubMed over the internet to find any abstracts containing the gene name and/or species name, and report the PubMed ID and article details (author, journal, year, volume, pages) in a text file report.

-          For top marks, come up with an extra plot showing numbers of papers found per year or similar for a given query.
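For the literature mining part, the search term and the per-year counts could be prepared as in this sketch. It makes several assumptions: the function names are invented, the article list is placeholder data rather than real PubMed results, and the term uses PubMed's `[Title/Abstract]` field tag, which also matches titles. No network call is made here; in the pipeline the term would be passed to BioPython's `Entrez.esearch(db="pubmed", term=...)` after setting `Entrez.email`.

```python
from collections import Counter


def pubmed_term(gene, species, require_both=True):
    """Build a PubMed search term limited to titles and abstracts.

    require_both=True joins the two phrases with AND, False with OR,
    covering both readings of "gene name and/or species name".
    """
    joiner = " AND " if require_both else " OR "
    return joiner.join(f'"{phrase}"[Title/Abstract]' for phrase in (gene, species))


def papers_per_year(articles):
    """Count articles per publication year, returned as sorted (year, count) pairs."""
    counts = Counter(article["year"] for article in articles if article.get("year"))
    return sorted(counts.items())


term = pubmed_term("MAPK1", "Plasmodium falciparum")
print(term)

# Placeholder records standing in for parsed search results (not real articles).
ARTICLES = [
    {"pmid": "placeholder-1", "year": 2015},
    {"pmid": "placeholder-2", "year": 2017},
    {"pmid": "placeholder-3", "year": 2015},
    {"pmid": "placeholder-4", "year": None},
]

per_year = papers_per_year(ARTICLES)
print(per_year)  # [(2015, 2), (2017, 1)]
```

The `(year, count)` pairs are already in the shape needed for a bar chart of papers found per year.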

 

If anything is unclear, please email me.

 
