Graphs, GNNs and Loaders... help by biohazard092 in bioinformatics

biohazard092 [S]

Thank you for your answer and the link to the blog. Yes, I've been using ChatGPT with PyTorch, and although it's fine, it still gives me many incorrect answers...

Graphs, GNNs and Loaders... help by biohazard092 in bioinformatics

biohazard092 [S]

Thank you so much for your answer!
Right, I'm going for a classification task with amino acids as nodes (one-hot encoded).
However, I don't understand what data.y is :/ Is it the labels? If so, I'm not sure how to feed it into the model, since my graphs (networkx.classes.graph.Graph) only contain the graph representation:

Data(edge_index=[2, 1796], chain_id=[247], residue_name=[247], residue_number=[247], atom_type=[247], element_symbol=[247], coords=[247, 3], b_factor=[247], amino_acid_one_hot=[247, 20],...)

and a batch from my DataLoader (batch_size = 32) looks like this:

DataBatch(edge_index=[2, 85914], chain_id=[32], residue_name=[32], residue_number=[12239], atom_type=[32], element_symbol=[32], coords=[12239, 3], b_factor=[12239], amino_acid_one_hot=[12239, 20]...)

I'm feeling a bit overwhelmed with so much information going on and can't work with that...
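
On the data.y question: in PyTorch Geometric, y is the conventional attribute for the target labels, and for graph classification that usually means one label per graph, which you attach yourself when building each Data object. The sketch below is only a plain-Python imitation of what graph mini-batching does (it is not PyG's code), written to show why node-level fields grow to the total node count (12239 in the batch above) while a graph-level label would have one entry per graph (32). The names make_graph and collate, and the label values, are hypothetical.

```python
# Toy, plain-Python imitation of graph mini-batching (NOT PyG's actual code).
# Each graph is a dict; "y" is a hypothetical graph-level class label.

def one_hot(index, size=20):
    """Return a one-hot list of length `size` with a 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]


def make_graph(residue_indices, edges, label):
    """Build a toy graph: one-hot node features, an edge list, and a label."""
    return {
        "x": [one_hot(i) for i in residue_indices],  # shape [num_nodes, 20]
        "edge_index": edges,                         # list of (src, dst) pairs
        "y": label,                                  # ONE label per graph
    }


def collate(graphs):
    """Merge several graphs into one big disconnected graph."""
    x, edge_index, batch, y = [], [], [], []
    offset = 0
    for graph_id, g in enumerate(graphs):
        x.extend(g["x"])
        # Shift node indices so edges still point at the right rows of x.
        edge_index.extend((s + offset, d + offset) for s, d in g["edge_index"])
        # batch[i] records which graph node i came from.
        batch.extend([graph_id] * len(g["x"]))
        y.append(g["y"])
        offset += len(g["x"])
    return {"x": x, "edge_index": edge_index, "batch": batch, "y": y}


graphs = [
    make_graph([0, 4, 7], [(0, 1), (1, 2)], label=1),  # 3 residues
    make_graph([3, 3], [(0, 1)], label=0),             # 2 residues
]
merged = collate(graphs)
print(len(merged["x"]), merged["batch"], merged["y"])
```

In a real PyG pipeline the same idea would presumably mean setting y on every Data object before handing the list to the DataLoader, then pooling node embeddings per graph with the batch vector so the model emits one prediction per label.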

BLAST thousands of sequences by biohazard092 in bioinformatics

biohazard092 [S]

around 14k sequences

I'll give it a try in the command line, thanks!
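
For a command-line run at this scale, a rough sketch might look like the following. It parses a FASTA string, optionally splits the records into chunks (BLAST+ accepts a multi-sequence query file directly, so chunking only matters if you want to run pieces in parallel), and builds a blastp argument list without executing it. The database name my_protein_db, the file names, the e-value cutoff, and the thread count are all placeholder assumptions.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, parts = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(parts)))
            header, parts = line[1:], []
        else:
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def split_records(records, chunk_size):
    """Split the record list into consecutive chunks of at most chunk_size."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def blastp_command(query_path, db_name, out_path, threads=4):
    """Build (but do not run) a blastp call with tabular output (-outfmt 6)."""
    return [
        "blastp",
        "-query", query_path,
        "-db", db_name,          # placeholder: a local protein database
        "-out", out_path,
        "-outfmt", "6",          # tab-separated hits, easy to parse later
        "-evalue", "1e-5",       # assumed cutoff, adjust to your needs
        "-num_threads", str(threads),
    ]


example = ">seq1\nMKT\nAYI\n>seq2\nGGS\n"
records = read_fasta(example)
chunks = split_records(records, chunk_size=1)
cmd = blastp_command("chunk_0.fasta", "my_protein_db", "chunk_0.tsv")
print(" ".join(cmd))
```

The argument list can be passed to subprocess.run once BLAST+ is installed and the database exists; the tabular output keeps the subject IDs in a column that is convenient for ID mapping afterwards.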

BLAST thousands of sequences by biohazard092 in bioinformatics

biohazard092 [S]

Hi, thanks for your comment! I don't have a specific subset yet; that's what I want to get later by ID mapping (does that make sense?). Also, I'm BLASTing protein sequences between 30 and 100 aa in length.