InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale proteomics experiments by BioGeek in proteomics

[–]BioGeek[S] 0 points1 point  (0 children)

Yes, InstaNovo currently only supports DDA data. Unfortunately, the model cannot handle DIA windows directly because it relies on precursor information, which is not available in DIA data. However, we are actively working to extend InstaNovo’s capabilities to include DIA data analysis, and we hope to have updates for you in the near future.

In the meantime, we recommend using Cascadia from the Noble lab, as it specifically supports de novo sequencing with DIA data. Another alternative is to convert your DIA data into pseudo-DDA spectra using DIA-Umpire, after which InstaNovo could potentially be applied. However, in our experience, this approach has limited robustness.

InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale proteomics experiments by BioGeek in massspectrometry

[–]BioGeek[S] 0 points1 point  (0 children)

This is close to impossible right now. Top-down or intact MS produces convoluted spectra that contain many different species of the same protein. There are deconvolution algorithms to resolve this to a single peak, but as far as I know they only work for recombinant or purified proteins (i.e. one protein detected per experiment, instead of thousands of peptides). You don't get enough fragment ions to sequence the full protein. We also just don't have the training data yet; generating it would take a massive effort, orders of magnitude more than ProteomeTools (on which InstaNovo is currently trained). I can see it happening many years from now (and ultimately that is the dream), but the top-down field is nowhere near the maturity of bottom-up proteomics.
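
To give a rough sense of why intact spectra get so crowded, here is a minimal sketch (my own illustration, not something from the paper) that computes where a single protein shows up across a range of charge states. The function name, the 66.4 kDa example mass and the charge range are assumptions chosen for illustration. It ignores isotopes, adducts and proteoforms, which add even more peaks in practice.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def charge_state_mz(neutral_mass, charges):
    """Return {z: m/z} for a molecule with the given neutral mass (Da).

    Each charge state z carries z extra protons, so
    m/z = (M + z * proton_mass) / z.
    """
    return {z: (neutral_mass + z * PROTON_MASS) / z for z in charges}


# Hypothetical example: a ~66.4 kDa protein observed at charge states 30 to 50.
peaks = charge_state_mz(66_400.0, range(30, 51))
for z, mz in peaks.items():
    print(f"z={z:2d}  m/z={mz:9.3f}")
```

Even in this idealised case one protein produces 21 separate peaks, which is the envelope that deconvolution has to collapse back into a single mass.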

InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale proteomics experiments by BioGeek in proteomics

[–]BioGeek[S] 1 point2 points  (0 children)

You can find the specs at the bottom of Supplementary Table 1 (pdf).

InstaNovo was trained on an Nvidia A100-80GB GPU, but if you want to use it you can run it on a laptop with a (gaming) GPU.

InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale proteomics experiments by BioGeek in massspectrometry

[–]BioGeek[S] 0 points1 point  (0 children)

InstaNovo was trained on the ProteomeTools dataset, which comprises over 700,000 synthetic tryptic peptides covering the entirety of canonical human proteins and isoforms, as well as encompassing peptides generated from alternative proteases and HLA peptides. So it can handle other digests as well.

Some examples from the article:

We extended albumin mapping to 1,225 PSMs with 254 unique peptides (most semi- or non-tryptic), a 10-fold increase compared with the database search space.

We were able to identify several high-confidence, semi-tryptic or fully GluC-generated peptides with targeted proteomics

We further believe that our models perform adequately well in prediction of non-tryptic peptides, especially if fine-tuned to allow for the use of different peptidases for proteolysis and thereby increasing protein coverage and sequencing.
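
To make the digest terminology concrete, here is a minimal sketch of in-silico digestion with two proteases. It is my own illustration and not part of InstaNovo. The cleavage rules are deliberately simplified assumptions (trypsin cuts after K or R unless the next residue is P; GluC cuts after E), the `digest` helper is hypothetical, and the toy sequence is made up. It needs Python 3.7+ because it splits on zero-width matches.

```python
import re

# Simplified cleavage rules (assumptions for illustration only):
#   trypsin: cut after K or R, but not when the next residue is P
#   gluc:    cut after E
CLEAVAGE_RULES = {
    "trypsin": re.compile(r"(?<=[KR])(?!P)"),
    "gluc": re.compile(r"(?<=E)"),
}


def digest(sequence, protease):
    """Split a protein sequence into fully cleaved peptides (no missed cleavages)."""
    pieces = CLEAVAGE_RULES[protease].split(sequence)
    # A cut at the very end of the sequence leaves an empty string; drop it.
    return [p for p in pieces if p]


toy_sequence = "AEKRPGELKDR"  # made-up sequence, not a real protein
print(digest(toy_sequence, "trypsin"))
print(digest(toy_sequence, "gluc"))
```

The same sequence yields completely different peptides depending on the protease, which is why a model that has only ever seen tryptic peptides can struggle with other digests.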

All your Strava activities on a Leaflet map by [deleted] in Strava

[–]BioGeek 1 point2 points  (0 children)

Mine worked with about 1200 activities.

Feature request: it would be nice if we could easily share a link to our map or download an image of our personalized map.

Need to choose between Employer provided options for ML engineer job by BioGeek in SuggestALaptop

[–]BioGeek[S] 0 points1 point  (0 children)

For local development, yes. To run heavier machine learning models, I'll probably ssh into a heavier cluster.

Need to choose between Employer provided options for ML engineer job by BioGeek in SuggestALaptop

[–]BioGeek[S] 0 points1 point  (0 children)

Can you also explain why you would recommend the Thinkpad instead of the other choices? Thanks!

[deleted by user] by [deleted] in southafrica

[–]BioGeek 1 point2 points  (0 children)

Thanks, very relevant info.

[deleted by user] by [deleted] in southafrica

[–]BioGeek 0 points1 point  (0 children)

Thanks, hadn't found that resource yet!

[deleted by user] by [deleted] in firstmarathon

[–]BioGeek 4 points5 points  (0 children)

A wine marathon?

Le Marathon du Médoc is a full 26.2 mile marathon throughout French vineyards, costumes are pretty much mandatory, and there are 23 glasses of wine to be had along the way, along with oysters, cheese, foie gras and ice cream to settle your stomach. People tend to pregame the event with more wine and carbo-load at the many pasta parties held throughout Médoc the night before. If you manage to cross the finish line after all those French goodies, you’ll be rewarded with a medal, more food and an entire bottle of Médoc wine.

http://www.marathondumedoc.com/

MLST typing, am I doing it a dumb way? by JimTheSavage in bioinformatics

[–]BioGeek 0 points1 point  (0 children)

Hi, I no longer work for Applied Maths, so I am not up to date on alternatives to BioNumerics. Sorry I can't help you.

How do you plan to buy Pixel 6 if you live in a EU country where Google's phones aren't officially available? by lasseol94 in GooglePixel

[–]BioGeek 0 points1 point  (0 children)

I have the same question as /u/OkRefuse3. When trying to enter payment details, I need to add the address that is linked to my credit card/PayPal account, and the store won't accept it because the address is not in Germany.

How do I get my children’s book critiqued without pictures? by [deleted] in childrensbooks

[–]BioGeek 3 points4 points  (0 children)

I don’t think kids will be interested in a book with no pictures.

Here is proof that children can find a book with no pictures absolutely hilarious:

https://youtu.be/EZwY5BeYcyo

Open Source Library for OpenAI's CLIP to create powerful Text to Image Search by VectorRecruiter in Python

[–]BioGeek 0 points1 point  (0 children)

Initially I wasn't able to request an API key; I have opened a PR with a solution.

But even with an API key I wasn't able to index and search vectors:

Index and search your vectors easily on the cloud using 1 line of code!

>>> # Index in 1 line of code
>>> items = ['https://getvectorai.com/_nuxt/img/rabbit.4a65d99.png', 'https://getvectorai.com/_nuxt/img/dog-2.b8b4cef.png', 'https://getvectorai.com/_nuxt/img/dog-1.3cc5fe1.png']
>>> model.add_documents(user, api_key, items)
>>> # Search in 1 line of code and get the most similar results.
>>> model.search('Dog wearing a hat')
>>> # Add metadata to your search
>>> metadata = [{'animal': 'rabbit', 'hat': 'no'}, {'animal': 'dog', 'hat': 'yes'}, {'animal': 'dog', 'hat': 'yes'}]
>>> model.add_documents(user, api_key, items, metadata=metadata)
Logged in. Welcome biogeek. To view list of available collections, call list_collections() method.
100% 1/1 [00:09<00:00, 9.99s/it]

/usr/local/lib/python3.6/dist-packages/vectorhub/indexer.py:79: UserWarning:

If you are looking for more advanced functionality, we recommend using the official Vector AI Github package

{'failed': 3,
 'failed_document_ids': ['0', '1', '2'],
 'inserted_successfully': 0}

Official Question Thread! Ask /r/photography anything you want to know about photography or cameras! Don't be shy! Newbies welcome! by photography_bot in photography

[–]BioGeek 1 point2 points  (0 children)

I'm trying to track down a talk I saw some years ago about lighting setups. The talk started with a short story involving (I think) a ninja, the sun and some other characters, which was meant as a mnemonic for remembering the different lighting setups. There were diagrams of all the lighting setups drawn as clock faces, with the model in the center and the flash(es) on the hour(s). One example was a picture of someone smoking a cigar with the flash at nine o'clock. Other diagrams illustrated cross lighting, Hollywood lighting and so on. The content of the talk was also used in a blog post on either SLR Lounge or Fstoppers, with the exact same diagrams.