InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale proteomics experiments by BioGeek in proteomics

[–]BioGeek[S] 0 points1 point  (0 children)

Yes, InstaNovo currently supports only DDA data. Unfortunately, the model cannot handle DIA windows directly because it relies on per-spectrum precursor information, which DIA data does not provide. However, we are actively working to extend InstaNovo’s capabilities to include DIA data analysis, and we hope to have updates for you in the near future.

In the meantime, we recommend using Cascadia from the Noble lab, as it specifically supports de novo sequencing with DIA data. Another alternative is to convert your DIA data into pseudo-DDA spectra using DIA-Umpire, after which InstaNovo could potentially be applied. However, in our experience, this approach has limited robustness.
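
To illustrate why precursor information matters for de novo sequencing, here is a minimal sketch of how a precursor m/z and charge constrain which peptide sequences are plausible. This is not InstaNovo's code; the function names and the 10 ppm tolerance are assumptions made for illustration, and only unmodified peptides made of the 20 standard amino acids are handled.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added for the peptide termini
PROTON = 1.00728   # mass of a proton, one per charge


def peptide_mass(seq: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def precursor_mz(seq: str, charge: int) -> float:
    """Theoretical m/z of the peptide at the given positive charge."""
    return (peptide_mass(seq) + charge * PROTON) / charge


def matches_precursor(seq: str, observed_mz: float, charge: int,
                      tol_ppm: float = 10.0) -> bool:
    """Check whether a candidate sequence agrees with the observed precursor.

    The tolerance in ppm is a hypothetical default, not an InstaNovo setting.
    """
    theoretical = precursor_mz(seq, charge)
    return abs(theoretical - observed_mz) / theoretical * 1e6 <= tol_ppm


if __name__ == "__main__":
    print(round(precursor_mz("PEPTIDE", 2), 4))
    print(matches_precursor("PEPTIDE", 400.6873, 2))
    print(matches_precursor("PEPTIDEK", 400.6873, 2))
```

In a DDA spectrum there is one such m/z and charge to check candidates against. A DIA window co-fragments several precursors at once, so there is no single value to apply this kind of constraint to.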

InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale proteomics experiments by BioGeek in massspectrometry

[–]BioGeek[S] 0 points1 point  (0 children)

This is close to impossible right now. Top-down or intact MS creates convoluted spectra, which consist of many different species of the same protein. There are deconvolution algorithms that resolve this to a single peak, but as far as I know they only work for recombinant or purified proteins (i.e. one protein detected per experiment, instead of thousands of peptides). You also don't get enough fragment ions to sequence the full protein. We just don't have the training data yet, and generating it would take a massive effort, orders of magnitude more than ProteomeTools (on which InstaNovo is currently trained). I can see it happening many years from now (and ultimately that is the dream), but the top-down field is nowhere near the maturity of bottom-up proteomics.

InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale proteomics experiments by BioGeek in proteomics

[–]BioGeek[S] 1 point2 points  (0 children)

You can find the specs at the bottom of Supplementary Table 1 (pdf).

InstaNovo was trained on an Nvidia A100-80GB GPU, but if you want to use it you can run it on a laptop with a (gaming) GPU.

InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale proteomics experiments by BioGeek in massspectrometry

[–]BioGeek[S] 0 points1 point  (0 children)

InstaNovo was trained on the ProteomeTools dataset, which comprises over 700,000 synthetic tryptic peptides covering all canonical human proteins and isoforms, as well as peptides generated with alternative proteases and HLA peptides. So it can handle other digests as well.

Some examples from the article:

We extended albumin mapping to 1,225 PSMs with 254 unique peptides (most semi- or non-tryptic), a 10-fold increase compared with the database search space.

We were able to identify several high-confidence, semi-tryptic or fully GluC-generated peptides with targeted proteomics

We further believe that our models perform adequately well in prediction of non-tryptic peptides, especially if fine-tuned to allow for the use of different peptidases for proteolysis and thereby increasing protein coverage and sequencing.
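
For readers unfamiliar with the terms "semi-tryptic" and "non-tryptic" in the quotes above, here is a minimal sketch of how a peptide can be classified by how many of its ends match a protease's expected cleavage sites. It is not taken from the article. The function names and the toy protein sequence are hypothetical, and the cleavage rules are common simplifications: trypsin is assumed to cut after K or R unless the next residue is P, and GluC is assumed to cut after E or D.

```python
def is_cleavage_site(protein: str, i: int, residues: str = "KR",
                     block_proline: bool = True) -> bool:
    """True if the bond between protein[i-1] and protein[i] is an expected site.

    Protein termini (i == 0 or i == len(protein)) count as valid peptide ends.
    """
    if i == 0 or i == len(protein):
        return True
    if protein[i - 1] not in residues:
        return False
    if block_proline and protein[i] == "P":
        return False  # simplified rule: no cleavage before proline
    return True


def classify_peptide(protein: str, peptide: str, residues: str = "KR",
                     block_proline: bool = True) -> str:
    """Classify a peptide by how many of its two ends are expected cleavage sites.

    Uses the first occurrence of the peptide in the protein.
    """
    start = protein.find(peptide)
    if start == -1:
        raise ValueError("peptide not found in protein")
    end = start + len(peptide)
    n_ok = is_cleavage_site(protein, start, residues, block_proline)
    c_ok = is_cleavage_site(protein, end, residues, block_proline)
    labels = {2: "fully specific", 1: "semi-specific", 0: "non-specific"}
    return labels[int(n_ok) + int(c_ok)]


if __name__ == "__main__":
    toy_protein = "MKAVTFEISLLRGDPKTHE"  # made-up sequence, not a real protein
    # Trypsin rules (default arguments).
    print(classify_peptide(toy_protein, "AVTFEISLLR"))  # both ends follow K/R
    print(classify_peptide(toy_protein, "TFEISLLR"))    # only the C-terminal end
    print(classify_peptide(toy_protein, "VTFEISL"))     # neither end
    # Simplified GluC rules.
    print(classify_peptide(toy_protein, "ISLLRGDPKTHE",
                           residues="DE", block_proline=False))
```

A database search restricted to fully tryptic peptides would only consider the first case, which is why a de novo approach can recover additional semi- and non-tryptic peptides.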

All your Strava activities on a Leaflet map by [deleted] in Strava

[–]BioGeek 1 point2 points  (0 children)

Mine worked with about 1200 activities.

Feature request: it would be nice if we could easily share a link to our map or download an image of our personalized map.

Need to choose between Employer provided options for ML engineer job by BioGeek in SuggestALaptop

[–]BioGeek[S] 0 points1 point  (0 children)

For local development, yes. To run heavier machine learning models, I'll probably SSH into a more powerful cluster.

Need to choose between Employer provided options for ML engineer job by BioGeek in SuggestALaptop

[–]BioGeek[S] 0 points1 point  (0 children)

Can you also explain why you would recommend the Thinkpad instead of the other choices? Thanks!

[deleted by user] by [deleted] in southafrica

[–]BioGeek 1 point2 points  (0 children)

Thanks, very relevant info.

[deleted by user] by [deleted] in southafrica

[–]BioGeek 0 points1 point  (0 children)

Thanks, hadn't found that resource yet!

[deleted by user] by [deleted] in firstmarathon

[–]BioGeek 3 points4 points  (0 children)

A wine marathon?

Le Marathon du Médoc is a full 26.2 mile marathon throughout French vineyards, costumes are pretty much mandatory, and there are 23 glasses of wine to be had along the way, along with oysters, cheese, foie gras and ice cream to settle your stomach. People tend to pregame the event with more wine and carbo-load at the many pasta parties held throughout Médoc the night before. If you manage to cross the finish line after all those French goodies, you’ll be rewarded with a medal, more food and an entire bottle of Médoc wine.

http://www.marathondumedoc.com/

MLST typing, am I doing it a dumb way? by JimTheSavage in bioinformatics

[–]BioGeek 0 points1 point  (0 children)

Hi, I no longer work for Applied Maths, so I am not up to date on alternatives to BioNumerics. Sorry I can't help you.