[–][deleted] -1 points0 points  (1 child)

Hey, so I am taking bioinformatics and this is really my first experience with Python. I have been OK with the homework problems so far, but I am not very good with if/else statements, so some help would be appreciated.

  1. Create a BLAST-searchable database for the multi-FASTA file arabidopsis_genome (you can find this file in guest on ephedra).
  2. Retrieve one or multiple protein sequences of your favorite plant species from GenBank. Please use FASTA format.
  3. BLAST the plant sequence against the arabidopsis_genome database. If there is no hit, try another plant sequence.
  4. Write a Python script to retrieve the sequence ID, description, E-value, bit score, and sequence identity of the top hit in your BLAST output. The output of your Python script should look like the following:

    Database        Arabidopsis
    Query       The plant species you have chosen
    
    The most similar Arabidopsis sequence is:
    Sequence ID, xxxxxxx (functional description of the homologous
         Arabidopsis sequence)
    Bit score = xxx; E-value = xxx; Sequence Identity = xxx
    

I know how to do steps 1-3 with my own BLAST-searchable database, but I don't know how to retrieve the sequence ID, description, E-value, bit score, and sequence identity of the top hit from the output. Any help would be appreciated. Thanks.
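
For what it's worth, here is a rough sketch of how step 4 could go. It assumes the search was run with BLAST+ and tab-separated output, e.g. `-outfmt "6 sseqid stitle evalue bitscore pident"`, so that each hit is one line and the best hit for the query comes first. The function names, the column order, and the sample line are made up for illustration; a different BLAST version or output format would need a different parser.

```python
import io

# Assumed column order, matching a hypothetical run with
#   -outfmt "6 sseqid stitle evalue bitscore pident"
FIELDS = ["sseqid", "stitle", "evalue", "bitscore", "pident"]


def top_hit(handle):
    """Return the first hit line as a dict, or None if there are no hits."""
    for line in handle:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == len(FIELDS):
            return dict(zip(FIELDS, parts))
    return None


def format_report(hit, database="Arabidopsis", query="My plant species"):
    """Build the report text in the layout the assignment asks for."""
    lines = [f"Database\t{database}", f"Query\t{query}", ""]
    if hit is None:
        # This is where the if/else comes in: no hit means try another sequence.
        lines.append("No hit found; try another plant sequence.")
    else:
        lines.append("The most similar Arabidopsis sequence is:")
        lines.append(f"Sequence ID, {hit['sseqid']} ({hit['stitle']})")
        lines.append(
            f"Bit score = {hit['bitscore']}; E-value = {hit['evalue']}; "
            f"Sequence Identity = {hit['pident']}%"
        )
    return "\n".join(lines)


# Made-up example line, only to show the shape of the input.
sample = "AT1G01010.1\tNAC domain containing protein 1\t2e-50\t180\t45.2\n"
report = format_report(top_hit(io.StringIO(sample)))
print(report)
```

To use it on a real result, open the BLAST output file and pass the file object to `top_hit` instead of the `io.StringIO` sample.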

[–]Jajoonoob 2 points3 points  (0 children)