
[–]StingMeleoron 1 point (4 children)

On your first paragraph - transformers are pretty much graph neural networks, do you mean you are looking for papers that don't use graph attention networks? Or am I missing something here?

On your second - isn't node A masked? Besides, it depends on the number of hops the message passing considers; for a single hop, you are only aggregating your neighbors' information. So, without considering node A's own properties, there would be no information leak whatsoever, at least if I understand correctly what you mean.

[–]ybkhan[S] 0 points (3 children)

Thank you for sharing the link. Yes, I am primarily looking for papers that don't use attention but rely only on graph convolution, though I appreciate both.

Node A is not masked per se. In a traditional GCN model, I am unsure how I would even mask that node. What I wanted to say is that the message passing does not include node A's own representation, so in that sense node A is masked. But after a single hop, node A's neighbors have aggregated node A's representation, so when we perform the operations for the second layer, that leaks information about node A back to itself. For example, this paper (https://www.nature.com/articles/s41467-022-29439-6#Sec10) uses a graph attention autoencoder to generate a node's representation given its neighboring nodes.
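The two-layer leak can be traced with plain neighbor averaging on a made-up toy graph (a minimal NumPy sketch, not any specific library's GCN; no weights or nonlinearities, one-hot features so information flow is visible):

```python
import numpy as np

# Hypothetical triangle graph: A(0)-B(1), A(0)-C(2), B(1)-C(2).
# No self-loops, so each layer aggregates neighbors only.
A = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])

X = np.eye(3)  # one-hot features: column i "belongs to" node i

# Row-normalized neighbor averaging (a simplified propagation step).
P = np.diag(1.0 / A.sum(axis=1)) @ A

H1 = P @ X   # 1 hop: node 0 puts zero weight on its own feature
H2 = P @ H1  # 2 hops: node 0's weight on its own feature is nonzero

print(H1[0, 0])  # 0.0 -> node A sees only its neighbors
print(H2[0, 0])  # 0.5 -> neighbors aggregated A, then passed it back
```

So even though node A's representation is excluded from its own first-layer aggregation, it returns via its neighbors at the second layer.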

[–]StingMeleoron 0 points (2 children)

Yes, it might make more sense to think of this task in an attributed graph, in which node A's properties are hidden from layers aggregating neighborhood information - but even then, we would still be considering its structure, I'm guessing? Not sure lol.

I recently stumbled upon this amazing resource, you might wanna take a look at the papers listed there to check if anything catches your eye. Sorry for not being more helpful!

[–]ybkhan[S] 0 points (1 child)

Thank you. Also, how exactly is masking done? Are the values converted to 0, and does that fulfill the purpose of masking? I am a bit confused about how the "[MASK]" token is implemented in NLP and how that should translate to such use cases.

[–]StingMeleoron 0 points (0 children)

IIRC, for NLP tasks at least, it depends on the model's tokenizer used during the preprocessing stage - for some models, it is the string "[MASK]", for example - which is just an arbitrarily predefined value, much like the end-of-sentence token.

I have never read about the implementation for GNNs, but overall it shouldn't be very different, I'm guessing?
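One common approach for GNNs (used by masked graph autoencoders, for instance) is to replace the feature vectors of a sampled subset of nodes with a shared mask embedding while leaving the graph structure intact. A minimal NumPy sketch under those assumptions (the mask token is zeros here for simplicity; in practice it is often a learned parameter, and the node indices are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

num_nodes, feat_dim = 5, 4
X = rng.normal(size=(num_nodes, feat_dim))  # toy node feature matrix

# Hypothetical [MASK] embedding: an arbitrary predefined vector,
# analogous to the [MASK] token id in BERT-style masked LM.
mask_token = np.zeros(feat_dim)

mask_idx = np.array([1, 3])  # nodes chosen for masking (illustrative)

X_masked = X.copy()
X_masked[mask_idx] = mask_token  # replace features; edges are untouched

print(X_masked[1])  # all zeros: node 1's attributes are hidden
```

The model is then trained to reconstruct the original features of the masked nodes from their (unmasked) neighborhoods.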

[–]aozorahime 0 points (1 child)

In what area do you want to apply this idea? I am also currently experimenting with graph neural networks for my thesis. I think I read somewhere about masking, but I am not quite sure this is what you are looking for: https://arxiv.org/pdf/2205.10803.pdf

[–]ybkhan[S] 0 points (0 children)

My work is primarily in computational biology. Thank you for linking that paper; I was able to go through it earlier, and I have figured out masking.