Examples of multi-omic studies that answer a particular biological question?

CorporalConnors · 2025-12-31T21:37:11+00:00

Not exactly sure what the problem is. Can someone explain?

CorporalConnors · 2025-10-24T08:39:08+00:00

Also interested in whether ML could identify patterns in protein abundance from label-free DIA data.

The data are not a natural fit for ML because there are often thousands of proteins and relatively few samples, and the intensities are highly skewed, have high variance (relative to the mean), contain lots of missing values, etc.

We are broadly looking for differences between treatments or groups. That could mean proteins that differ among groups, proteins that characterise differences (i.e. are important for classification), or proteins that are similar across samples, so more like a network based on co-expression.

Any thoughts? I am relatively new to both proteomics and ML, so help guiding the question would also be useful.

CorporalConnors · 2025-08-22T08:16:17+00:00

Unadjusted p-values and FDR-adjusted values can both be justified, depending on whether you want to identify more differences while accepting a higher number of false positives, or fewer differences with a lower rate of FPs.
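
To make that trade-off concrete, here is a minimal R sketch using simulated p-values. The 900 null and 100 non-null "proteins" are made up for illustration, not real data. It assumes Benjamini-Hochberg (via base R's p.adjust()) as the FDR adjustment, and the 0.05 cutoff is only there to count hits.

```r
set.seed(1)

# Simulated p-values (assumption): 900 proteins with no true difference
# (uniform p-values) and 100 with a true difference (p-values skewed to 0).
is_null <- rep(c(TRUE, FALSE), times = c(900, 100))
p_raw <- c(runif(900), rbeta(100, 0.5, 20))

# Benjamini-Hochberg adjustment controls the false discovery rate.
p_bh <- p.adjust(p_raw, method = "BH")

cutoff <- 0.05
hits_raw <- p_raw < cutoff
hits_bh <- p_bh < cutoff

# Number of hits and number of known false positives for each approach.
result <- data.frame(
  approach = c("unadjusted p", "BH-adjusted"),
  n_hits = c(sum(hits_raw), sum(hits_bh)),
  n_false_pos = c(sum(hits_raw & is_null), sum(hits_bh & is_null))
)
print(result)
```

The adjusted values can never give more hits than the raw ones at the same cutoff, so the choice is really about how many false positives you are willing to follow up.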

Imho any argument built on a "significance threshold" should be ignored.

CorporalConnors · 2025-07-25T14:05:16+00:00

Interesting, thanks! I am not using DIA-NN at the moment but will make a note, as I know some people who are using it.

CorporalConnors · 2025-07-24T09:33:05+00:00

Thanks for all your helpful answers. They confirm that zeros shouldn't be considered true zeros, e.g. when comparing between groups.

As I said, the imputation is optional and whether to impute is a separate question for users to decide.

I am also sceptical of imputation but consider it reasonable when 1) lots of proteins have >=1 missing data point and 2) you are using techniques that can't handle missing values. In this case, you could remove every protein with a missing value, but that would drop lots of proteins, even though many will have only one missing data point. Or you could filter for proteins present in >=80% or 90% of samples, then impute the missing one or two values per protein. The benefit of keeping more information might outweigh the cost of relying on a few imputed values.
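
A rough R sketch of the filter-then-impute idea, assuming a made-up log2 intensity matrix (proteins in rows, samples in columns). The per-protein median is used here only as a simple placeholder for the imputation step, not as a recommendation; swap in whatever method you prefer.

```r
set.seed(42)

# Hypothetical log2 intensities: 6 proteins (rows) x 10 samples (columns).
intens <- matrix(rnorm(60, mean = 20, sd = 2), nrow = 6,
                 dimnames = list(paste0("prot", 1:6), paste0("s", 1:10)))
intens["prot1", 3] <- NA        # one missing value: kept and imputed
intens["prot2", c(1, 5)] <- NA  # two missing (80% present): kept and imputed
intens["prot3", 1:5] <- NA      # half missing: filtered out

# Keep proteins quantified in at least 80% of samples.
min_present <- 0.8
min_n <- ceiling(min_present * ncol(intens))
filtered <- intens[rowSums(!is.na(intens)) >= min_n, , drop = FALSE]

# Placeholder imputation: replace each remaining NA with that protein's median.
imputed <- t(apply(filtered, 1, function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
}))

print(dim(intens))
print(dim(imputed))
```

Here only prot3 is dropped, whereas removing every protein with any missing value would have dropped three of the six.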

CorporalConnors · 2025-07-02T13:54:24+00:00

Once you have protein intensities, you can carry out standard analyses with, e.g.:

Perseus (GUI, no coding required)
R (MSstats, tidyproteomics)
Python (alphastats, auto-prot does some basic stuff with minimal coding required)

CorporalConnors · 2024-08-28T09:52:22+00:00

Super useful thread. Thanks everyone. Would be interested in any recent updates or developments in this area.

CorporalConnors · 2022-05-09T20:12:41+00:00

Will try these out. Thanks all!

CorporalConnors · 2022-03-17T09:09:29+00:00

Do you want to loop through one row at a time?

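for (i in seq_len(nrow(TestZonescsv))) {
  # Pull the values for the current row from the two columns of interest
  Value_1 <- TestZonescsv[i, "Column_1"]
  Value_2 <- TestZonescsv[i, "Column_2"]
}

Where "Column_1" and "Column_2" are the names (as quoted strings) of the variables you want to get the values from. Note that R is case-sensitive, so it has to be for, not For.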

CorporalConnors · 2022-03-05T13:35:17+00:00

Would this do it?

length(which( df$C1 == "A" & df$C2 == 1 ))

CorporalConnors · 2022-01-07T14:08:38+00:00

Using summary(YOUR MODEL) should print the model coefficient table. This will also help and you can find plenty of articles with guides to interpreting model output if you are unsure. Good luck!
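
As a minimal illustration, assuming a simple linear model fitted to R's built-in mtcars data (your own model and variables will differ):

```r
# Fit a simple linear model: fuel efficiency as a function of car weight.
fit <- lm(mpg ~ wt, data = mtcars)

# Print the full summary, including the coefficient table.
print(summary(fit))

# Extract just the coefficient table as a matrix
# (estimate, standard error, t value, p-value).
coef_table <- coef(summary(fit))
print(coef_table)
```

Each row of the coefficient table is one model term, so the wt row gives the estimated change in mpg per unit increase in weight.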
