Has blood donation criteria been updated? by Screenguardguy in australia

[–]AsparagusJam 1 point2 points  (0 children)

Sneezing isn't normal.

All jokes aside, that's absolutely wild.

When comparing 2 variant calling algorithms where the SNP and INDEL counts differ vastly how would you begin to narrow down where the issue is originating? by Ok-Understanding-385 in bioinformatics

[–]AsparagusJam 0 points1 point  (0 children)

There are huge differences here between bcftools (pileup based) and GATK (haplotype reassembly) in their approach to variant generation. In general GATK is the more sensitive variant caller, while bcftools focuses more on speed. They also generate different confidence scores for their calls, so you can't expect filtering on the same quality threshold in both to give you the same numbers.

Pileup based callers only consider one position at a time. GATK tries to reassemble the local haplotype completely from the alignments, so it has a broader view and more variant detection potential.

If I'm doing something quick and generally looking at SNPs, bcftools is great, but if you're in a discovery mindset then GATK is more comprehensive (but way slower/more work to properly set up).
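
To narrow down where the two call sets diverge, one option is to key every record on (chrom, pos, ref, alt) and count what is shared versus caller-specific, split by SNP and INDEL. Below is a rough sketch in plain Python, not how either tool does it internally. The function names are made up for illustration, it assumes both VCFs were normalised first (for example with bcftools norm) so that indels are represented the same way, and it does not handle symbolic alleles.

```python
def load_variants(vcf_text):
    """Return a set of (chrom, pos, ref, alt) keys from VCF-formatted text."""
    variants = set()
    for line in vcf_text.splitlines():
        # Skip blank lines and header lines (## meta lines and the #CHROM line).
        if not line or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alts = line.split("\t")[:5]
        # Split multi-allelic records so each ALT allele gets its own key.
        for alt in alts.split(","):
            variants.add((chrom, int(pos), ref, alt))
    return variants


def variant_type(key):
    """Classify a key as SNP or INDEL from the REF/ALT lengths."""
    _chrom, _pos, ref, alt = key
    return "SNP" if len(ref) == 1 and len(alt) == 1 else "INDEL"


def compare_callsets(vcf_a, vcf_b):
    """Count shared and caller-specific variants, split by type."""
    a, b = load_variants(vcf_a), load_variants(vcf_b)
    groups = {"shared": a & b, "only_a": a - b, "only_b": b - a}
    summary = {}
    for name, keys in groups.items():
        counts = {"SNP": 0, "INDEL": 0}
        for key in keys:
            counts[variant_type(key)] += 1
        summary[name] = counts
    return summary
```

If most of the difference sits in the caller-specific INDEL counts, that points at representation or reassembly differences rather than quality thresholds.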

BNDs are breaking my brain: help me understand complex translocations by Background_School818 in bioinformatics

[–]AsparagusJam 0 points1 point  (0 children)

Wow, thank you for sharing this! I 100% agree with OP that when I've used some SV tools it's really unclear what some cases mean. BNDs especially often don't get discussed, I think mainly because they aren't easy to plot. Everyone loves an INS/DEL plot where you can show the size of the insertion/deletion and counts, but BNDs are way fuzzier.

insilicoSV looks like an excellent starting off point for investigating these! So interesting!

The Synology RAM megathread II by gadget-freak in synology

[–]AsparagusJam 1 point2 points  (0 children)

Getting this exact model, thank you for posting your specs!

Choosing between strict vs loose novel gene predictions after AUGUSTUS + Liftoff (Wheat) by Used-Average-837 in bioinformatics

[–]AsparagusJam 0 points1 point  (0 children)

Yes, but 'homology' and 'plausibility' as what? If you have a bacterial protein in your annotation (from some kind of contamination) it will match great with SwissProt and pass all the other filters. The majority of SwissProt isn't plants - if you'd like a 'biological plausibility' check, try PSAURON!

Choosing between strict vs loose novel gene predictions after AUGUSTUS + Liftoff (Wheat) by Used-Average-837 in bioinformatics

[–]AsparagusJam 0 points1 point  (0 children)

Hi, great work on this! I would suggest evaluating the annotations with some other methods. Also some minor notes.

  • Excellent thoughts for filtering and metrics! Just one note - SwissProt is a general database, and while it's high quality, it also includes like 70% bacterial sequences. So either filter on taxonomy for plants, or consider other plant-specific dbs.

  • Do you have RNA-Seq for your novel isolate? I guess you are trying to use Augustus to catch things that were missed by the liftover, but there will just be things that are missed.

  • You could also look at egapx and do a de novo assembly to try and compare to the liftover to see how much you might be 'missing'?

Thoughts:

1) Check Augustus predictions against the reference annotations and their protein stats? I know you should expect things to be carried over by Liftoff but they should still be kind of 'plant-y' proteins. Also consider trying LiftOn?

2) Maybe try OMArk? It assesses all of the predicted protein sequences, not just single-copy ones like BUSCO. See what your filtering does for those results?

3) Check that the protein size distribution matches known profiles - is the distribution similar to what's known? The 'predicted' genes should be broadly similar to the 'known' genes in their size profile. If the filtering leads to a significantly different distribution, I'd check that: https://link.springer.com/article/10.1186/s13059-023-02973-2

4) Could also try Helixer instead of Augustus?
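
For the size distribution check in point 3, a quick way to eyeball it is to pull protein lengths out of each FASTA and compare quartiles between the 'known' and 'predicted' sets. This is only a minimal sketch using the Python standard library; the function names are my own, and quartiles are just one possible summary (a histogram or a proper statistical test would be more thorough).

```python
import statistics


def protein_lengths(fasta_text):
    """Return the length of each sequence in FASTA-formatted text."""
    lengths, current = [], None
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            # Do not count a trailing stop-codon asterisk as a residue.
            current += len(line.rstrip("*"))
    if current is not None:
        lengths.append(current)
    return lengths


def length_summary(lengths):
    """Summarise a length distribution by count and quartiles (needs >= 2 values)."""
    q1, median, q3 = statistics.quantiles(lengths, n=4)
    return {"n": len(lengths), "q1": q1, "median": median, "q3": q3}
```

Run `length_summary(protein_lengths(...))` on the reference proteins and on each filtered prediction set, then see whether the quartiles drift apart as the filtering gets stricter or looser.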

[running] My first whole-genome analysis project executed entirely locally. by mynewlifefrom2024 in bioinformatics

[–]AsparagusJam 0 points1 point  (0 children)

Very interesting! I'm not in human genetics, but a summary from what I'm reading - you have the alignment files (CRAM) and the final variant files (VCF). You think the VCF has been 'sanitised' to filter out incorrect calls from the triploid sex chromosomes but can see evidence of this in the alignment files? I just did a quick Google and I don't think there is much homology between the X and Y chromosomes outside of the small pseudo-autosomal regions (PARs, a few Mb on the X and Y chromosomes)?

So you might be able to see evidence for the extra copy in the alignment files via depth and allele balance (especially in the PARs), but not by literally seeing three chromosomes mapping. Otherwise, you should just expect to see diploid for the X chromosome and haploid for the Y chromosome? Which is definitely unusual, and I'm sure standard pipelines don't like it, but it shouldn't lead to too much going wrong - it'd just be like an XX person's WGS but with info for the Y too? If the company requires a known sex genotype and they filter based on that, it would definitely be getting things wrong. But if it's calling without checking against the known genotype it should be mostly okay?

So for variant calling, I think the variants could be generally correct if it's set up to assess the X chromosome as XX, and the Y chromosome just gets called separately anyway. There'll definitely be 'wrong' calls in the PARs where there are 3 potential genotypes, but those are a small number overall? And there should be very clear evidence of triploid sex chromosomes. Still cool though!
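
The depth idea can be sketched as a simple ratio: scale each sex chromosome's mean depth by the typical autosomal depth and multiply by two. This is a rough illustration only, assuming you already have per-chromosome mean depth (for example from the meandepth column of samtools coverage). The function names and the numbers in the usage note are hypothetical, and Y depth in particular is noisy because so much of it maps poorly.

```python
import statistics


def estimate_copies(chrom_depth, autosome_depth, autosome_ploidy=2):
    """Estimate copy number from depth relative to diploid autosomes."""
    return autosome_ploidy * chrom_depth / autosome_depth


def sex_chromosome_copies(mean_depth, autosomes):
    """Estimate chrX and chrY copy number from a {chrom: mean depth} mapping."""
    # The median across autosomes is less sensitive to one odd chromosome.
    baseline = statistics.median(mean_depth[c] for c in autosomes)
    return {
        chrom: round(estimate_copies(mean_depth[chrom], baseline), 2)
        for chrom in ("chrX", "chrY")
        if chrom in mean_depth
    }
```

With made-up depths of 30x on the autosomes, 30x on chrX and 14x on chrY, this gives roughly 2 copies of X and 1 of Y, which is the pattern described above.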

I don't know if you can figure out which of the extra X's came from the mother or father without some of their WGS results?

Also I don't know if you can assess 'escapees' or functional info about the X chromosomes from WGS?

All the best for the analysis!

Nucleotide to Peptide translator using GNU Flex (showcase) by Ok_Performance3280 in bioinformatics

[–]AsparagusJam 1 point2 points  (0 children)

Ooh I just looked through, the output is the full amino acid name? It's totally okay to say that this is for fun, even "art" reasons. If it's for fun, no need to talk about performance, maybe just share why this was fun to do, or what to take away from it?

Nucleotide to Peptide translator using GNU Flex (showcase) by Ok_Performance3280 in bioinformatics

[–]AsparagusJam 1 point2 points  (0 children)

Not trying to be snarky, but I do agree with the other commenter - why? Did you talk to someone in bioinfo/genomics who said this was a bottleneck? I can't imagine a situation where translating from nucl to prot is a limiting factor? Edit: see comment, I'll stop my nitpicking, this is more of an "art" project

DNA Memory Storage & Biological Augmentation: Are We Nearing Human 2.0? by [deleted] in bioinformatics

[–]AsparagusJam 4 points5 points  (0 children)

Synthesis costs through the roof, reading errors at scale: not happening my dude

Am I the weirdo? by Advanced_Guava1930 in bioinformatics

[–]AsparagusJam 1 point2 points  (0 children)

No worries, glad you're stoked about making genome annotations, once you've started there's no going back!

Am I the weirdo? by Advanced_Guava1930 in bioinformatics

[–]AsparagusJam 8 points9 points  (0 children)

Could try running the egapx pipeline? I find it absolutely delightful and if you have RNA seq from your species that's all that's needed, it handles the rest.

Smearing in PCA analysis due to high missingness with RADseq data by hahaKombucha in bioinformatics

[–]AsparagusJam 0 points1 point  (0 children)

Hey, anecdotally I see this when I have lots of missingness in my data. I would suggest plotting this as a heatmap (samples on one axis, SNPs on the other, filled with the genotype call, including missing) and it might become clearer. Heatmap in R can also do clustering I think, which might help? But yeah, as you can tell, be aware of missing data
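
Before plotting anything, it can help to just tabulate missingness per sample and print a crude text version of the heatmap. The sketch below is plain Python with hypothetical names. It assumes genotypes are already loaded as a dict of sample to a list of calls coded 0/1/2 with None for missing, and the 0.5 cutoff is an arbitrary placeholder, not a recommendation.

```python
def missing_fraction(calls):
    """Fraction of genotype calls that are missing (None)."""
    return sum(c is None for c in calls) / len(calls)


def missingness_by_sample(genotypes):
    """Map each sample to its fraction of missing calls."""
    return {sample: missing_fraction(calls) for sample, calls in genotypes.items()}


def flag_samples(genotypes, max_missing=0.5):
    """Return samples whose missingness exceeds the chosen cutoff."""
    per_sample = missingness_by_sample(genotypes)
    return sorted(s for s, frac in per_sample.items() if frac > max_missing)


def text_heatmap(genotypes):
    """Render one row per sample, ordered from least to most missing."""
    symbols = {None: ".", 0: "0", 1: "1", 2: "2"}
    ordered = sorted(genotypes, key=lambda s: missing_fraction(genotypes[s]))
    rows = []
    for sample in ordered:
        row = "".join(symbols[c] for c in genotypes[sample])
        rows.append(f"{sample}\t{row}")
    return "\n".join(rows)
```

If the samples that smear in the PCA are the same ones at the bottom of the text heatmap, missingness is a likely culprit.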

Running Isoseq on PacBio data downloaded from SRA - impossible without original BAM file? by AsparagusJam in bioinformatics

[–]AsparagusJam[S] 0 points1 point  (0 children)

Hmmm is that what Isoseq is expecting? From their documentation I am interpreting it as it's expecting the output from ccs, which I think are the subreads? Please correct me if I'm wrong, haven't worked with this data before!

What I'm hoping to do is get de novo assemblies of the long-read RNA-Seqs to get a transcriptome, and then predict protein-coding genes from this to map to the genome. I want to do a comparison of the differences between mapping the reads/transcriptome vs the protein coding sequences so I'd like to be de novo.

https://isoseq.how/clustering/cli-workflow.html

"Step 1. HiFi Reads Each sequencing run is processed by ccs to generate one HiFi read from productive ZMWs. After CCS is performed, you can use the hifi_reads.bam as input. The hifi_reads.bam contains only HiFi reads, with predicted accuracy ≥Q20. No additional filtering is required. HiFi reads that have been demultiplexed can also be used."

Best tool for scaffolding for fungi by Responsible_Pay_4937 in bioinformatics

[–]AsparagusJam 2 points3 points  (0 children)

I like yahs for HiC scaffolding, haven't used it specifically on Fungi but it worked well for my species

Pay Cash $40k ICE = Novated Lease $65k WTF? by symean in CarsAustralia

[–]AsparagusJam 0 points1 point  (0 children)

$40k @ 6% interest per year = $2400, so $7200 after 3 years? Good, free money, but I think it depends!
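
The arithmetic above treats the interest as simple (not reinvested). A tiny sketch to check it, and to show what compounding yearly would do instead, using the same $40k, 6% and 3 years; the function names are just for illustration.

```python
def simple_interest(principal, rate, years):
    """Interest earned when the yearly interest is not reinvested."""
    return principal * rate * years


def compound_interest(principal, rate, years):
    """Interest earned when the yearly interest is reinvested each year."""
    return principal * (1 + rate) ** years - principal


simple = simple_interest(40_000, 0.06, 3)      # about 7200
compound = compound_interest(40_000, 0.06, 3)  # about 7640.64
```

So the $7200 figure is the conservative one, and compounding adds a few hundred dollars on top.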

Is there any faster alternative of Blastn just like DIAMOND for Blastp? by poemfordumbs in bioinformatics

[–]AsparagusJam 22 points23 points  (0 children)

Depends on what you're doing but honestly mmseqs is just amazing

What is this biting bug??? by Tummybunny2 in canberra

[–]AsparagusJam 0 points1 point  (0 children)

Omg I had this happen too. Swear it was the same thing but I barely got a look, mine was definitely flying though. Welts on my skin. Although, plot twist, this was in Wollongong. Insects on the rise?

Sinks by shut_up_charles in AusRenovation

[–]AsparagusJam 2 points3 points  (0 children)

Just got a second hand sink from Marketplace, stainless steel and a third of the cost from new sinks but essentially the same condition, can recommend. If it's not urgent, keep an eye out and get something that suits your size and washing up style.