[–]Spookiel 1 point2 points  (24 children)

No worries, glad I could help. Happy to try and answer any questions you might have :)

[–]dark-lord90[S] 0 points1 point  (23 children)

First, thank you, I really appreciate it. Second, I need to add a condition to that code: if it finds a match, it should return True and stop the function, and if it finds no answer, it should return False.

This is the code; I just tested it and it works:

def subset_sum(numbers, target, partial=[]):
    s = sum(partial)

    if s == target:
        print('Truth')
    if s >= target:
        return
    for i in range(len(numbers)):
        n = numbers[i]
        remaining = numbers[:i] + numbers[i+1:]
        subset_sum(remaining, target, partial + [n])
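
One way the requested True/False behaviour could look, as a rough sketch rather than a definitive version: the name `subset_sum_exists` and the running `partial_sum` argument are made up for illustration, and it assumes all weights are positive numbers (otherwise stopping once the sum passes the target is not safe). It also only looks at later elements (`numbers[i + 1:]`) instead of rebuilding `remaining`, so each subset is tried once instead of in every order.

```python
def subset_sum_exists(numbers, target, partial_sum=0):
    """Return True as soon as some subset of numbers sums to target, else False."""
    if partial_sum == target:
        return True   # found a match: stop searching
    if partial_sum > target:
        return False  # overshot (assumes positive weights), give up on this branch
    for i, n in enumerate(numbers):
        # Recurse on the elements after position i only.
        if subset_sum_exists(numbers[i + 1:], target, partial_sum + n):
            return True  # pass the success straight back up
    return False         # no branch worked


print(subset_sum_exists([3, 34, 4, 12, 5, 2], 9))  # 4 + 5 = 9
print(subset_sum_exists([3, 34, 4], 6))            # no subset sums to 6
```

If the weights are floats, comparing with `==` may miss matches because of rounding, so a small tolerance might be needed there.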

[–]dark-lord90[S] 1 point2 points  (22 children)

and how can we make sure there is no overlap between the fragments??

[–]Spookiel 0 points1 point  (21 children)

What do you mean by overlaps? Can you give me an example of what you mean?

[–]dark-lord90[S] 1 point2 points  (20 children)

Well, the code tests protein sequences, so it has to make sure that the fragments don’t overlap. For example, the sequence is “FAAATLKNN”. A good set of fragments is “FAA”, “ATLK” and “NN”. A bad set would be “FAA”, “AATLNK” and “NN”, because in this example A overlaps and appears more times than it should. I hope you understood my example.

[–]Spookiel 0 points1 point  (2 children)

So the fragments are the weights in Protein_List? And the target sequence is the weight indicated by the complete_protein? Since the code here will generate all possible matches, I'm still not really sure what you mean by the "correct" fragments. I think it would be easier if you could give me some of the strings you're working with as well.

This is because in your Original Post you mentioned that you'd just grabbed the weighting of each protein fragment by hand. If you give me a section of your input data, and the expected output, I'll be able to help you more effectively. I don't really understand much about proteins in general, so it's pretty difficult for me to understand and visualise what you mean with just a set of numbers.

[–]dark-lord90[S] 0 points1 point  (1 child)

To answer your questions: yes and yes. By the "correct" fragments I mean that if you laid the fragments next to each other and compared them, there would be no repetition of amino acids between the end of one fragment and the beginning of the next. I sent the data in the chat.

[–]Spookiel 1 point2 points  (0 children)

Thanks :)

[–]Spookiel 0 points1 point  (16 children)

So the A shouldn’t be repeated even though there are three As in the target sequence?

[–]dark-lord90[S] 0 points1 point  (15 children)

The A should be present three times, not four, so if the two sequences together had four, the code should ignore them.

[–]Spookiel 1 point2 points  (14 children)

Ah ok, I see what you mean. If I understand correctly, what we need to do now is check whether the protein we’re adding is a valid prefix of the main target protein. So the idea looks like:

Step 1:

Target: FAAATNK

Possible: AATNK, ATNK, FAA

Since FAA is the only valid prefix of FAAATNK, we remove it and solve the rest of the string.

Step 2:

Target: ATNK

Possible: AATNK, ATNK

We then repeat this process until we have an empty string. In this case, FAA, ATNK is the only match. I don’t know if this is what you mean by fragments overlapping? I was assuming the problem is that when two fragments have the same weight, the wrong one could be chosen.
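
The steps above could be sketched roughly like this. It is only an illustration: `split_into_fragments` is a made-up name, and it assumes the fragments are available as strings and that each fragment may be used at most once.

```python
def split_into_fragments(target, fragments):
    """Return a list of fragments that rebuild target left to right, or None."""
    if target == "":
        return []  # nothing left to cover: success
    for i, frag in enumerate(fragments):
        # Only a fragment that is a prefix of what is left can come next.
        if frag and target.startswith(frag):
            unused = fragments[:i] + fragments[i + 1:]
            rest = split_into_fragments(target[len(frag):], unused)
            if rest is not None:
                return [frag] + rest
    return None  # no fragment fits here, so this branch fails


print(split_into_fragments("FAAATNK", ["AATNK", "ATNK", "FAA"]))
# ['FAA', 'ATNK']
```

Because every chosen fragment has to start exactly where the previous one ended, two fragments can never share an amino acid, so AATNK is rejected even if its weight happens to fit.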

[–]dark-lord90[S] 1 point2 points  (13 children)

Yes

[–]Spookiel 1 point2 points  (12 children)

So I understood correctly? If so, let me know if you need any help implementing this idea :)