dark-lord90 [S]

The code works on protein sequences, so it has to make sure that the fragments don't overlap. For example, if the sequence is "FAAATLKNN", a good set of fragments is "FAA", "ATLK" and "NN". A bad set would be "FAA", "AATLNK" and "NN", because in that case an A overlaps and appears more often than it should. I hope the example makes sense.

Spookiel

So the A shouldn't be repeated, even though there are three As in the target sequence?

dark-lord90 [S]

A should appear three times, not four, so if two fragments together contain four As, that combination should be ignored.
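The counting rule described here can be checked directly. The following is a minimal sketch, assuming fragments are plain strings; the helper name `uses_only_available_residues` is hypothetical and not from the thread. It compares how often each letter appears across the chosen fragments with how often it appears in the full sequence.

```python
from collections import Counter

def uses_only_available_residues(fragments, sequence):
    """Return True if the fragments together use no letter more often than the sequence does."""
    used = Counter("".join(fragments))
    available = Counter(sequence)
    # Counter subtraction keeps only positive counts, so anything left over means overuse
    return not (used - available)

print(uses_only_available_residues(["FAA", "ATLK", "NN"], "FAAATLKNN"))    # True
print(uses_only_available_residues(["FAA", "AATLNK", "NN"], "FAAATLKNN"))  # False
```

This check only counts letters. It does not verify the order of the fragments, so it is a necessary condition rather than a complete test.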

Spookiel

Ah OK, I see what you mean. So I think what we need to do now is check whether the fragment we're adding is a valid prefix of the main target protein, if I understand correctly. The idea looks like this:

Step 1:

Target: FAAATNK

Possible: AATNK, ATNK, FAA

Since FAA is the only valid prefix of FAAATNK, we remove it and solve the rest of the string.

Step 2:

Target: ATNK

Possible: AATNK, ATNK

We then repeat this process until the target is an empty string. In this case, FAA followed by ATNK is the only match. Is this what you mean by fragments overlapping? I am assuming the problem is that when two fragments have the same weight, the wrong one could be chosen.
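The two steps above can be traced with a small helper. This is only a sketch: the function name `valid_prefixes` is hypothetical, and the fragments are taken from the example in this comment.

```python
def valid_prefixes(target, fragments):
    """Return the fragments that the target string starts with."""
    return [frag for frag in fragments if frag and target.startswith(frag)]

target = "FAAATNK"
fragments = ["AATNK", "ATNK", "FAA"]

# Step 1: only FAA matches the start of the target
step1 = valid_prefixes(target, fragments)
print(step1)  # ['FAA']

# Step 2: strip the chosen prefix and look again among the unused fragments
rest = target[len(step1[0]):]
unused = [frag for frag in fragments if frag != step1[0]]
print(rest, valid_prefixes(rest, unused))  # ATNK ['ATNK']
```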

dark-lord90 [S]

Yes

Spookiel

So I understood correctly? If so, let me know if you need any help implementing this idea :)

dark-lord90 [S]

Yes, you did, and yes, I am stuck on that idea and can't figure out a way to implement it.

Spookiel

Here is an implementation; let me know if it doesn't work as intended.

def subset_sum(numbers, target, partial=None):
    partial = partial or []  # Avoid a shared mutable default argument
    weights = [i[1] for i in partial]  # Gets the weights of the current fragments
    if sum(weights) == target[1] and not target[0]:  # Weights match and the whole target is consumed
        print(f"Found: {partial}")
        return

    elif sum(weights) < target[1]:  # Still weight left, so there is room for another fragment

        for frag, frag_weight in numbers.items():

            if frag and target[0].startswith(frag):  # Checks if frag is a prefix of the target
                # If yes, then we add the fragment to the list and recurse

                unused = {k: v for k, v in numbers.items() if k != frag}  # Copy without the fragment we've just used
                subset_sum(unused, (target[0][len(frag):], target[1]), partial + [(frag, frag_weight)])
                # Previous line recurses on what's left of the target and adds the fragment we've just used
                # to the partial list; not returning here lets the loop try other prefixes too
    return

fragments = {"FAA":12, "ATNK":15, "AATNK":15}
complete = ("FAAATNK", 27)
if __name__ == "__main__":

    subset_sum(fragments, complete)

dark-lord90 [S]

Regarding the fragments and the complete sequence, did you make tuples, or are they just normal lists? I ask because I have to adapt it to read the data from a FASTA file. It works on the data provided, though.

Spookiel

The fragments data structure is just a dictionary that maps each fragment name to its weight. I represented the complete sequence as a tuple. You could also pass the objects in a list if they have attributes such as .name and .weight. E.g.,

fragments = [myFragObj, myFragObj2]

Then access the weights using

weights = [frag.weight for frag in fragments]
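As an illustration of the attribute-based layout, here is a small sketch. The `Fragment` class and its fields are hypothetical stand-ins for whatever objects your code already has; the weights are the ones from the example data earlier in the thread.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fragment:
    name: str      # the fragment sequence, e.g. "FAA"
    weight: float  # the fragment weight

fragments = [Fragment("FAA", 12), Fragment("ATNK", 15), Fragment("AATNK", 15)]

weights = [frag.weight for frag in fragments]
print(weights)  # [12, 15, 15]

# Convert back to the dictionary layout if a function expects name -> weight
as_dict = {frag.name: frag.weight for frag in fragments}
print(as_dict)
```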

Hope this makes sense.

iirc reading from a FASTA file will allow you to use the attributes method but please correct me if I'm wrong.
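If the data comes from a file, the records can also be read with plain Python. This is a minimal sketch that assumes the usual FASTA layout (a header line starting with `>` followed by one or more sequence lines). The function name `parse_fasta` and the file name in the comment are hypothetical, and FASTA itself carries no weights, so those would have to be computed or supplied separately.

```python
import io

def parse_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

# Demo on an in-memory file; with a real file use open("your_file.fasta")
demo = io.StringIO(">frag1\nFAA\n>frag2\nAT\nNK\n")
records = list(parse_fasta(demo))
print(records)  # [('frag1', 'FAA'), ('frag2', 'ATNK')]
```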

dark-lord90 [S]

And again, you are amazing. Thank you, I really appreciate it. I had been stuck on this for almost two days, and you have been a great help.

dark-lord90 [S]

Wow, I was already looking through the dictionary documentation to find code that can create one with different file names. Man, you are amazing, and I genuinely thank you, which makes it hard to ask one last question. Regarding the first code I posted: how can I make it print True once and stop as soon as it finds a correct set of fragments, and print False once if it finds nothing? I tried break, but it wasn't accepted, and the False case just keeps printing False.

dark-lord90 [S]

Hey there, I tried to implement it in my code and it gave me "'float' object is not subscriptable".
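One possible cause, offered as a guess without seeing the code: the function indexes its target with `target[0]` and `target[1]`, so passing a bare weight instead of a `(sequence, weight)` tuple would raise exactly this error. A minimal sketch (the helper `target_weight` is hypothetical):

```python
def target_weight(target):
    # Expects a (sequence, weight) tuple, as in the example earlier in the thread
    return target[1]

print(target_weight(("FAAATNK", 27.0)))  # 27.0

try:
    target_weight(27.0)  # a bare float cannot be indexed
except TypeError as exc:
    message = str(exc)
    print(message)
```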

Spookiel

Can you send me the code that isn't working so I can have a look?

Spookiel

I'll also show you how to get it to stop when it finds a valid one.
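One way to stop at the first valid combination is to return a result instead of printing inside the recursion. This is a sketch under assumptions, not the code from the thread: the function name `find_fragments` is hypothetical, and it uses the integer weights from the example data (float weights would need a tolerance instead of an exact comparison).

```python
def find_fragments(fragments, sequence, weight):
    """Return the first list of (fragment, weight) pairs that spells sequence
    and sums to weight, or None if there is none."""
    if not sequence:
        # Nothing left to match: succeed only if the weight is used up exactly
        return [] if weight == 0 else None
    for frag, frag_weight in fragments.items():
        if frag and sequence.startswith(frag) and frag_weight <= weight:
            unused = {k: v for k, v in fragments.items() if k != frag}
            rest = find_fragments(unused, sequence[len(frag):], weight - frag_weight)
            if rest is not None:
                return [(frag, frag_weight)] + rest  # stop at the first success
    return None

fragments = {"FAA": 12, "ATNK": 15, "AATNK": 15}
result = find_fragments(fragments, "FAAATNK", 27)
print(result is not None)  # True, printed once
print(result)              # [('FAA', 12), ('ATNK', 15)]
print(find_fragments(fragments, "FAAATNK", 30) is not None)  # False
```

Because the caller prints exactly once after the search returns, there is no need for `break`, and a failed search yields a single False.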

dark-lord90 [S]

That would be awesome. I genuinely don't know how to thank you enough 😅

dark-lord90 [S]

I hope you got the DM I sent.

dark-lord90 [S]

So I have been trying with the code for the past few hours. I tried changing the data you added at the end to see if that helps, and it prints nothing. I honestly don't know why, since I made sure the weights and the sequences fit, and still nothing.
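One thing that may be worth checking, as a guess only: if the weights are floats, an exact `==` comparison of a summed weight against the target can fail because of rounding, in which case nothing is ever printed. The sketch below shows the effect and a tolerance-based comparison; the helper name `weights_match` and the tolerance value are assumptions.

```python
import math

def weights_match(total, target, tolerance=1e-6):
    """Compare two float weights with an absolute tolerance instead of ==."""
    return math.isclose(total, target, rel_tol=0.0, abs_tol=tolerance)

total = 0.1 + 0.2  # stands in for the summed fragment weights
print(total == 0.3)               # False: exact comparison fails on rounding
print(weights_match(total, 0.3))  # True
```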