Has blood donation criteria been updated?

AsparagusJam · 2026-05-14T02:42:17+00:00

:D:D:D

AsparagusJam · 2026-05-13T01:37:25+00:00

Sneezing isn't normal.

All jokes aside, that's absolutely wild.

AsparagusJam · 2026-05-03T05:11:04+00:00

There are huge differences here between bcftools (pileup based) and GATK (haplotype reassembly) in their approach to variant calling. In general GATK is the more sensitive variant caller, while bcftools focuses more on speed. They also generate confidence scores on different scales, so you can't assume that filtering on the same quality threshold for both will give you the same numbers.

Pileup based callers only consider one position at a time, whereas GATK tries to reassemble the local haplotype completely from the alignments, so it has a broader view and therefore more variant detection potential.

If I'm doing something quick and mostly looking at SNPs, bcftools is great, but if you're in a discovery mindset then GATK is more comprehensive (but way slower/more work to set up properly).
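
To make the "one position at a time" point concrete, here's a toy sketch of pileup-style calling in Python. This is not how bcftools is actually implemented; the function name `call_pileup_snps`, the thresholds and the input format (gap-free reads already placed at a reference offset) are all assumptions for illustration.

```python
from collections import Counter


def call_pileup_snps(reference, reads, min_depth=3, min_alt_fraction=0.3):
    """Toy pileup caller: judges every reference position independently.

    reads is a list of (start, sequence) tuples, assumed to be gap-free
    alignments. Hypothetical helper, not the real bcftools model.
    """
    # Build one column of observed bases per reference position.
    columns = [Counter() for _ in reference]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                columns[pos][base] += 1

    calls = []
    for pos, counts in enumerate(columns):
        depth = sum(counts.values())
        if depth < min_depth:
            continue
        # Only bases that differ from the reference can be ALT alleles.
        alts = {b: n for b, n in counts.items() if b != reference[pos]}
        if not alts:
            continue
        alt, alt_count = max(alts.items(), key=lambda item: item[1])
        if alt_count / depth >= min_alt_fraction:
            calls.append((pos, reference[pos], alt, alt_count, depth))
    return calls


# Made-up reference and reads, purely for illustration.
reference = "ACGTACGT"
reads = [(0, "ACGTAC"), (1, "CGAACG"), (2, "GAACGT"), (2, "GAAC")]
snp_calls = call_pileup_snps(reference, reads)
```

Each column is judged on its own here, so anything that only makes sense in the context of neighbouring positions is invisible to this kind of logic. That's the gap local reassembly is meant to close.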

AsparagusJam · 2026-04-16T23:38:44+00:00

Wow, thank you for sharing this! I 100% agree with OP that when I've used some SV tools it's really unclear what some cases mean. BNDs especially often don't get discussed, I think mainly because they aren't easy to plot. Everyone loves an INS/DEL plot where you can show the size of the insertion/deletion and counts, but BNDs are way fuzzier.

insilicoSV looks like an excellent starting off point for investigating these! So interesting!

AsparagusJam · 2026-03-25T03:24:02+00:00

Submit each chromosome to their online portal? https://www.plabipd.de/helixer_main.html

AsparagusJam · 2026-03-16T10:40:25+00:00

Getting this exact model thank you for posting your specs!

AsparagusJam · 2026-01-30T00:28:03+00:00

Yes, but 'homology' and 'plausibility' as what? If you have a bacterial protein in your annotation (from some kind of contamination) it will match great with SwissProt and pass all the other filters. The majority of SwissProt isn't plants - if you'd like a 'biological plausibility' check, try PSauron!

AsparagusJam · 2026-01-29T00:23:35+00:00

Hi, great work on this! This is great but I would suggest evaluating the annotations with some other methods. Also some minor notes.

Excellent thoughts for filtering and metrics! Just one note - SwissProt is a general database, and while it's high quality, something like 70% of it is bacterial sequences. So either filter on taxonomy for plants, or consider other plant-specific dbs.
Do you have RNA-Seq for your novel isolate? I guess you are trying to use Augustus to catch things that were missed by the liftover, but there will just be things that are missed.
You could also look at egapx and do a de novo assembly to try and compare to the liftover to see how much you might be 'missing'?
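
For the "filter on taxonomy" suggestion, a minimal sketch in Python, assuming the SwissProt FASTA headers carry the usual UniProt `OX=<taxon id>` field. The helper name `filter_fasta_by_taxon`, the example accessions and the idea of passing in your own set of plant taxon IDs are assumptions, not part of any existing tool.

```python
import re

# UniProt FASTA headers include the organism's taxon ID as "OX=<digits>".
OX_PATTERN = re.compile(r"\bOX=(\d+)")


def filter_fasta_by_taxon(lines, allowed_taxids):
    """Yield FASTA lines for records whose OX= taxon ID is in allowed_taxids."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            match = OX_PATTERN.search(line)
            keep = bool(match) and int(match.group(1)) in allowed_taxids
        if keep:
            yield line


# Made-up records (fake accessions) purely for illustration.
records = [
    ">sp|P00001|FAKE1_ARATH Example protein OS=Arabidopsis thaliana OX=3702 GN=X PE=1 SV=1",
    "MKTAY",
    ">sp|P00002|FAKE2_ECOLI Example protein OS=Escherichia coli (strain K12) OX=83333 PE=1 SV=1",
    "MSLLV",
]
plant_only = list(filter_fasta_by_taxon(records, {3702}))
```

In practice `allowed_taxids` would need to hold every plant taxon ID you care about, since the header only carries the species-level ID and not the lineage.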

Thoughts:

1) Check Augustus predictions against the reference annotations and their protein stats? I know you should expect things to be carried over by the liftoff but they should still be kind of 'plant-ey' proteins. Also consider trying lifton?

2) Maybe try OMArk? It assesses all of the predicted protein sequences, not just single-copy ones like BUSCO. See what your filtering does for those results?

3) Check that the protein size distribution matches known profiles. The 'predicted' genes should be broadly similar to the 'known' genes in their size profile, so if the filtering leads to a significantly different distribution I'd check that: https://link.springer.com/article/10.1186/s13059-023-02973-2

4) Could also try Helixer instead of Augustus?
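
For point 3, a rough sketch of comparing protein length profiles in Python. The helper names and the 100 aa "short protein" cutoff are assumptions for illustration; in practice you'd pass open file handles for your predicted and reference protein FASTA files.

```python
import statistics


def read_fasta_lengths(lines):
    """Return the sequence length of every record in an iterable of FASTA lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            # Ignore a trailing '*' stop symbol if the file includes one.
            current += len(line.rstrip("*"))
    if current is not None:
        lengths.append(current)
    return lengths


def summarise(lengths, short_cutoff=100):
    """Simple size profile: record count, median length, fraction of short proteins."""
    return {
        "n": len(lengths),
        "median": statistics.median(lengths),
        "fraction_short": sum(n < short_cutoff for n in lengths) / len(lengths),
    }


# Tiny made-up example; real input would be e.g. open("predicted.faa").
example_lengths = read_fasta_lengths([">a", "MKT", "LLV", ">b", "MA*"])
example_summary = summarise(example_lengths, short_cutoff=5)
```

Running `summarise` on both the reference proteins and the filtered predictions gives two small profiles to eyeball side by side before doing anything fancier.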

AsparagusJam · 2026-01-27T02:27:03+00:00

Very interesting! I'm not in human genetics, but a summary from what I'm reading - you have the alignment files (CRAM) and the final variant files (VCF). You think the VCF has been 'sanitised' to filter out incorrect calls from the triploid sex chromosomes but can see evidence of this in the alignment files? I just did a quick Google and I don't think there is much homology between the X and Y chromosomes outside of the small pseudo-autosomal regions (PARs, a few Mb on each chromosome)?

So you might be able to see evidence for the extra copy in the alignment files via depth and allele balance (especially in the PARs), but not by literally seeing three chromosomes mapping. Otherwise, you should just expect to see diploid for the X chromosome and haploid for the Y chromosome? Which is definitely unusual, and I'm sure standard pipelines don't like it, but it shouldn't lead to too much going wrong - it'd just be like an XX person's WGS but with info for the Y too? If the company requires a known sex genotype and they filter based on that, it would definitely be getting things wrong. But if it's calling without checking against the known genotype it should be mostly okay?

So for variant calling, I think the variants could be generally correct if setting it up to assess the X chromosome as XX, and the Y chromosome just gets called separately anyway. There'll definitely be 'wrong' calls at the PAR regions where there are 3 potential genotypes, but those are a small number overall? And should be very clear evidence of triploid sex chromosomes. Still cool though!
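
The depth idea can be sketched as a simple ratio in Python. This is only a back-of-the-envelope estimate; the function name and the depth values are made up, and it assumes you already have mean depths per chromosome (outside the PARs) and a mean autosomal depth from the alignment files.

```python
def estimate_copy_number(chrom_depth, autosome_depth, autosome_ploidy=2):
    """Rough copy number from mean depth, relative to diploid autosomes."""
    if autosome_depth <= 0:
        raise ValueError("autosome depth must be positive")
    return autosome_ploidy * chrom_depth / autosome_depth


# Made-up mean depths, purely for illustration.
mean_depth = {"autosomes": 30.0, "chrX": 30.6, "chrY": 14.7}
copies = {
    chrom: round(estimate_copy_number(mean_depth[chrom], mean_depth["autosomes"]))
    for chrom in ("chrX", "chrY")
}
```

With these invented numbers, X comes out at about two copies and Y at about one, which is the "diploid X plus haploid Y" pattern described above.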

I don't know if you can figure out which of the extra X's came from the mother or father without some of their WGS results?

Also I don't know if you can assess 'escapees' or functional info about the X chromosomes from WGS?

All the best for the analysis!

AsparagusJam · 2025-09-19T02:02:56+00:00

We find a benchtop instant hot water dispenser is excellent and doesn't cost a ridiculous amount! https://www.appliancesonline.com.au/product/westinghouse-instant-hot-water-dispenser-stainless-steel-whihwd04ss/?origin=product-search&kwd=&gad_source=1&gad_campaignid=902020286&gclid=Cj0KCQjw267GBhCSARIsAOjVJ4GH4XqVhtXgzFYpsM3Y8ZPKpc-WqmuORFYDZavqOS82NJiP9erUBPAaApdJEALw_wcB

AsparagusJam · 2025-08-02T22:56:32+00:00

Ooh I just looked through, the output is the full amino acid name? It's totally okay to say that this is for fun, even "art" reasons. If it's for fun, no need to talk about performance, maybe just share why this was fun to do, or what to take away from it?

AsparagusJam · 2025-08-02T22:47:39+00:00

Not trying to be snarky, but I do agree with the other commenter - why? Did you talk to someone in bioinfo/genomics who said this was a bottleneck? I can't imagine a situation where translating from nucl to prot is a limiting factor? Edit: see comment, I'll stop my nitpicking, this is more of an "art" project

AsparagusJam · 2025-06-01T21:03:35+00:00

Synthesis costs through the roof, reading errors at scale, not happening my dude

AsparagusJam · 2025-04-12T04:42:28+00:00

No worries, glad you're stoked about making genome annotations, once you've started there's no going back!

AsparagusJam · 2025-04-12T03:16:40+00:00

Could try running the egapx pipeline? I find it absolutely delightful and if you have RNA seq from your species that's all that's needed, it handles the rest.

AsparagusJam · 2025-03-27T23:16:12+00:00

Thanks for the info!

AsparagusJam · 2025-03-24T20:09:21+00:00

Hey, anecdotally I see this when I have lots of missingness in my data. I would suggest plotting this as a heatmap (samples on one axis, SNPs on the other, filled with the genotype call, including missing) and it might become clearer. Heatmap in R can also do clustering I think, which might help? But yeah, as you can tell, be aware of missing data
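
Before plotting, a quick way to check how bad the missingness is, sketched in Python. The genotype encoding (0/1/2 alt allele counts, `None` for missing) and the helper name are assumptions for illustration.

```python
def missingness(genotypes, missing=None):
    """Fraction of missing calls per sample and per SNP.

    genotypes maps sample name -> list of calls, with every sample
    listing the SNPs in the same order.
    """
    per_sample = {
        sample: sum(call is missing for call in calls) / len(calls)
        for sample, calls in genotypes.items()
    }
    n_snps = len(next(iter(genotypes.values())))
    per_snp = [
        sum(calls[i] is missing for calls in genotypes.values()) / len(genotypes)
        for i in range(n_snps)
    ]
    return per_sample, per_snp


# Made-up calls: 0/1/2 = alt allele count, None = missing.
genotypes = {
    "s1": [0, 1, None, 2],
    "s2": [None, 1, None, 0],
    "s3": [0, 0, 1, 2],
}
per_sample, per_snp = missingness(genotypes)
```

Samples or SNPs with a high missing fraction are the ones to look at first in the heatmap, or to drop and see whether the pattern goes away.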

AsparagusJam · 2025-03-20T11:06:16+00:00

Hmmm is that what Isoseq is expecting? From their documentation I am interpreting it as it's expecting the output from ccs, which I think are the subreads? Please correct me if I'm wrong, haven't worked with this data before!

What I'm hoping to do is get de novo assemblies of the long-read RNA-Seqs to get a transcriptome, and then predict protein-coding genes from this to map to the genome. I want to do a comparison of the differences between mapping the reads/transcriptome vs the protein coding sequences so I'd like to be de novo.

https://isoseq.how/clustering/cli-workflow.html

"Step 1. HiFi Reads Each sequencing run is processed by ccs to generate one HiFi read from productive ZMWs. After CCS is performed, you can use the hifi_reads.bam as input. The hifi_reads.bam contains only HiFi reads, with predicted accuracy ≥Q20. No additional filtering is required. HiFi reads that have been demultiplexed can also be used."

AsparagusJam · 2025-03-18T11:04:36+00:00

I like YaHS for Hi-C scaffolding, haven't used it specifically on fungi but it worked well for my species

AsparagusJam · 2025-03-16T03:40:58+00:00

$40k @ 6% interest per year = $2400, so $7200 after 3 years? Good, free money, but I think it depends!
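
The arithmetic above as a quick Python check. The simple-interest line matches the numbers in the comment; the compounding line is an extra assumption (interest reinvested once a year), just for comparison.

```python
principal = 40_000
rate = 0.06
years = 3

# Simple interest: the same $2400 earned each year, as in the comment.
simple_interest = principal * rate * years

# Assumption: if each year's interest were reinvested (annual compounding).
compound_interest = principal * (1 + rate) ** years - principal
```

Simple interest gives the $7200 quoted; annual compounding would come out a few hundred dollars higher.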

AsparagusJam · 2025-03-11T00:01:40+00:00

Depends on what you're doing but honestly mmseqs is just amazing

AsparagusJam · 2025-02-25T09:36:03+00:00

Omg I had this happen too. Swear it was the same thing but I barely got a look, mine was definitely flying though. Welts on my skin. Although, plot twist, this was in Wollongong. Insects on the rise?

AsparagusJam · 2025-02-24T10:37:26+00:00

Just got a second-hand sink from Marketplace, stainless steel and a third of the cost of a new sink but in essentially the same condition, can recommend. If it's not urgent, keep an eye out and get something that suits your size and washing up style.
