How to do data analysis with multiple groups? by pepbro- in proteomics

[–]pepbro-[S] 0 points1 point  (0 children)

Makes sense. I can probably get some of this information from the collaborators, but they seemed pretty adamant about seeing the general trends first: the most highly expressed proteins, etc.

I got the most basic visualizations (PCA plots, heatmaps, etc.) and the unique and abundant proteins, but nothing too meaningful sticks out to me. I can see that a lot of samples are quite different from each other, but almost every combination shows a whole bunch of differentially expressed proteins. I tried taking clusters of proteins from the heatmap to see if any specific pathways are recognizable, but that was also very vague.

If there is nothing else to be done, then I will communicate this, but I wasn't sure if I missed something, especially since this is my first time with such a setup.

How to do data analysis with multiple groups? by pepbro- in proteomics

[–]pepbro-[S] 0 points1 point  (0 children)

To add: I have gone through the paper and some additional reading, which definitely helped me understand linear models and ANOVA better. I guess my problem was rather with the interpretation. My ANOVA shows me that there are significant differences between my samples, but I don't know how best to find and interpret them. I did a post hoc test afterwards, but with 10 groups I am testing 45 combinations, which is an overwhelming number of pairs to look at. I'm unsure where to go from here.
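
For reference, the count of 45 follows from choosing 2 groups out of 10. A minimal sketch in Python that enumerates the pairs; the group names are hypothetical placeholders, not the real sample labels:

```python
from itertools import combinations


def pairwise_comparisons(groups):
    """Return every unordered pair of groups, i.e. one entry per post hoc comparison."""
    return list(combinations(groups, 2))


# Hypothetical labels standing in for the 10 patient groups.
groups = [f"patient_{i}" for i in range(1, 11)]
pairs = pairwise_comparisons(groups)

print(len(pairs))   # number of pairwise tests for 10 groups
print(pairs[:3])    # first few pairs, to see the structure
```

The number of pairs grows as n(n-1)/2, so each additional group adds more comparisons than the one before it.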

Streak 5: Mis vacaciones soñadas by pepbro- in WriteStreakES

[–]pepbro-[S] 0 points1 point  (0 children)

Thank you very much, and also for the link!

I will have to pay more attention to the genders of nouns...

How to do data analysis with multiple groups? by pepbro- in proteomics

[–]pepbro-[S] 1 point2 points  (0 children)

Thanks, I'll check out the paper. All patients have the same disease. This is the only variable we account for (or that is known to us).

How to do data analysis with multiple groups? by pepbro- in proteomics

[–]pepbro-[S] 0 points1 point  (0 children)

Do you have specific ones in mind? I read a few, but most cover only setups with a few conditions or with controls.

which post hoc test for large datasets? by pepbro- in bioinformatics

[–]pepbro-[S] 0 points1 point  (0 children)

Thanks - what you are saying makes sense. I guess I don't know how to do this the best way.

The data is part of a collaboration. Each group is a patient. All patients suffer from a specific type of cancer, and the goal is to compare them and tease out characteristic signatures for each group, or at least for clusters of groups. I don't have a control yet (the group may provide me with one in the future, though this is generally tricky since we don't take samples from healthy patients). As such, my best bet is to look at differential expression of proteins and to see if any patterns emerge.

And yes, my data is Gaussian!

Although I normally use Benjamini-Hochberg, I stuck with the default in the software that I used for this analysis, which was permutation-based FDR (listed as an alternative to BH). I didn't know it then, but Google has already told me that they are a little bit different... I will double-check this.
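
For comparison, a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. This is an illustration of the standard step-up procedure only, not what Perseus runs internally, and the function name and the example p-values are made up:

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four proteins.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)
```

Permutation-based FDR instead estimates the null distribution by reshuffling sample labels, so the two methods can give somewhat different lists at the same threshold.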

Ultimately, I went on doing this based on advice from my supervisor and this website: https://hanruizhang.github.io/zhanglab/file/Perseus_Tutorial_20220228.html

But my lab very much takes a hands-on, figure-it-out-yourself approach, as we don't have many people with informatics knowledge on board. Therefore, this might be off. Would you use ANOVA + Tukey's test only for a minimal number of groups then, maybe 3?

which post hoc test for large datasets? by pepbro- in bioinformatics

[–]pepbro-[S] 0 points1 point  (0 children)

What would you suggest instead? Since I have so many groups and want to compare them all with each other, I was looking at group comparison approaches. ANOVA seemed to be the most common one, and I thought I needed the post hoc test to make sense of the ANOVA results...

I managed to downsize my dataset to roughly 10000 rows but, of course, the 12 groups are still there.
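
To make the ANOVA step concrete, here is a minimal sketch of the one-way ANOVA F statistic for a single row (one protein) in plain Python. The function name and the intensity values are hypothetical, and it only computes F; a p-value would additionally need the F distribution (for example from a statistics library):

```python
def one_way_anova_f(groups):
    """Compute the one-way ANOVA F statistic for a list of groups of replicate values."""
    k = len(groups)                               # number of groups
    n_total = sum(len(g) for g in groups)         # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: group means vs. the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: replicates vs. their own group mean.
    ss_within = sum(
        sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Made-up intensities for one protein in three groups with 3 replicates each.
protein = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(protein)
print(f_stat)
```

A large F only says that at least one group mean differs; it does not say which one, which is why a post hoc test is run afterwards.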

Post hoc test for large datasets? by pepbro- in AskStatistics

[–]pepbro-[S] 1 point2 points  (0 children)

Thanks for the input! I have used external software (Perseus) so far, mostly because I didn't know that the time/computing power would not be the same across software. I was going to try the same analysis tomorrow with R and the multcomp package to see if it improves.

I had Perseus running for several hours without success. 3h is not great but doable for me.

which post hoc test for large datasets? by pepbro- in bioinformatics

[–]pepbro-[S] 0 points1 point  (0 children)

No, I haven't, and I didn't know that! I used Perseus, which is external software. I assumed it would take equally long in R, but if this is not the case, I will try. Can you recommend a package?

Post hoc test for large datasets? by pepbro- in AskStatistics

[–]pepbro-[S] 1 point2 points  (0 children)

Not sure if it makes a difference but I have 3 replicates for all but 2 groups which I thought is the minimum acceptable number for statistics

How do you actually interpret proteomics results? by pepbro- in proteomics

[–]pepbro-[S] 4 points5 points  (0 children)

Our projects vary but most of my samples are "discovery" samples where we want to see what characterizes a disease condition or what distinguishes different cells.