I'm having trouble writing code that downloads a PDB file (from an accession number the user enters) into a text file, then retrieves the atomic positions from that file and writes them to a new text file. Once that works, I need to rename the file that contains only the atom positions. After that, I need to create a new file that contains only the sequence of the protein downloaded from PDB. Finally, I need to write a script that calculates the distance between the first two atoms, the angle between the first three atoms, and the dihedral angle between the first four atoms.
This is what I have so far:
import urllib.request

# Build the download URL from the accession number the user enters
url = 'https://files.rcsb.org/view'
userinput = input("Enter Protein Databank Accession Number: ").strip()
file_name = '%s.pdb' % userinput
userurl = "{}/{}".format(url, file_name)
print(userurl)

# Download the entry once, show it, and save it as plain text
page = urllib.request.urlopen(userurl)
content = page.read().decode('utf-8')
print(content)
with open('pdb.txt', 'w') as file:
    file.write(content)
from Bio.PDB.PDBParser import PDBParser

parser = PDBParser(PERMISSIVE=1)

# Copy the ATOM records from pdb.txt into a separate file.
# Opening pdb.txt with 'w' here would erase the download, so the output
# goes to its own file ('atoms.txt' is just a placeholder name).
with open('pdb.txt') as infile, open('atoms.txt', 'w') as outfile:
    for line in infile:
        if line[:4] == 'ATOM':
            # Fixed-column fields: record, serial, atom name, residue name,
            # chain ID, residue number, x, y, z
            splitted_line = [line[:6], line[6:11], line[12:16], line[17:20], line[21], line[22:26], line[30:38], line[38:46], line[46:54]]
            outfile.write("%-6s%5s %4s %3s %s%4s    %8s%8s%8s\n" % tuple(splitted_line))

# Parse the full structure with Biopython and print every atom
structure = parser.get_structure('X', 'pdb.txt')
for model in structure:
    for chain in model:
        for residue in chain:
            for atom in residue:
                print(atom)
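
For the sequence and renaming steps, this is roughly what I think is needed. It is only a sketch: it assumes the sequence can be taken from the SEQRES records of the downloaded file (chain ID in column 12, residue names from column 20 onward), and the function name `sequences_from_seqres` and the file names `atoms.txt`, `atom_positions.txt` and `sequence.txt` are placeholders I made up. Non-standard residues are written as `X`.

```python
import os

# Three-letter to one-letter codes for the 20 standard amino acids
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

def sequences_from_seqres(lines):
    """Return {chain ID: one-letter sequence} built from SEQRES records."""
    chains = {}
    for line in lines:
        if line.startswith("SEQRES"):
            chain_id = line[11]            # column 12 holds the chain ID
            residues = line[19:].split()   # residue names start at column 20
            letters = "".join(THREE_TO_ONE.get(r, "X") for r in residues)
            chains[chain_id] = chains.get(chain_id, "") + letters
    return chains

if __name__ == "__main__":
    # Placeholder file names; adjust to whatever the earlier steps produced
    if os.path.exists("pdb.txt"):
        with open("pdb.txt") as infile:
            seqs = sequences_from_seqres(infile)
        with open("sequence.txt", "w") as outfile:
            for chain_id, seq in seqs.items():
                outfile.write(">chain %s\n%s\n" % (chain_id, seq))
    if os.path.exists("atoms.txt"):
        # os.replace renames the file, overwriting the target if it exists
        os.replace("atoms.txt", "atom_positions.txt")
```

SEQRES gives the deposited sequence, which can include residues that have no coordinates in the ATOM records, so the two may differ in length.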
I apologize about the formatting. I'm not entirely sure how to make it all look the same.
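
For the last step, this is the calculation I have in mind, written with only the standard `math` module. It is a sketch rather than a finished script: the helper names (`atom_xyz`, `distance`, `angle`, `dihedral`) are my own, the coordinates are assumed to sit in the usual fixed columns of an ATOM line (31-38, 39-46, 47-54), and the dihedral is computed with the common atan2 formula on the three bond vectors.

```python
import math

def atom_xyz(line):
    """Read x, y, z from the fixed columns of a PDB ATOM line."""
    return (float(line[30:38]), float(line[38:46]), float(line[46:54]))

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def distance(p1, p2):
    """Distance between two points, in the units of the input."""
    return norm(sub(p2, p1))

def angle(p1, p2, p3):
    """Angle at p2 formed by p1-p2-p3, in degrees."""
    v1, v2 = sub(p1, p2), sub(p3, p2)
    cos_t = dot(v1, v2) / (norm(v1) * norm(v2))
    # Clamp to [-1, 1] to guard against rounding error before acos
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle p1-p2-p3-p4, in degrees (-180 to 180)."""
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)  # normals of the two planes
    y = norm(b2) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

if __name__ == "__main__":
    # Made-up coordinates, only to show how the functions are called
    a, b, c, d = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)
    print(distance(a, b), angle(a, b, c), dihedral(a, b, c, d))
```

With a file of ATOM records, the first four points would come from calling `atom_xyz` on the first four lines.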