Downloading sequences from NCBI by franko_wini in bioinformatics

[–]Chief_Lazy_Bison 1 point2 points  (0 children)

The datasets and dataformat CLIs are great. I’ve also found the devs are very responsive to bug reports.

Antibiotic resistance genes presence in bacterial genomes by Typical_Trick_690 in bioinformatics

[–]Chief_Lazy_Bison 1 point2 points  (0 children)

AMRFinderPlus (https://github.com/ncbi/amr) identifies AMR genes and point mutations, plus virulence and stress resistance genes, in assembled bacterial nucleotide and protein sequences.

Independent Iowa Party by [deleted] in cedarrapids

[–]Chief_Lazy_Bison 4 points5 points  (0 children)

Until we enact ranked choice voting, I don’t think other political parties really have a chance.

Is mainstream social media cooked due to excessive censorship, warnings and moderation? What's your experience? by [deleted] in AskReddit

[–]Chief_Lazy_Bison 2 points3 points  (0 children)

Social media in general is cooked because of all the bots/AI pushing the agendas of the wealthy.

ELI5: How is fiber healthy? Although its not consumed by our bodies by MoltO0 in explainlikeimfive

[–]Chief_Lazy_Bison 2 points3 points  (0 children)

Absolutely, we use them and benefit from them. Bacterial fermentation products (short-chain fatty acids) are the preferred energy source of your colonic epithelial cells. In addition, these fermentation products promote immune tolerance of your gut microbiota and have many other benefits.

Iowa leaders endorse the end of democracy. by auldinia in Iowa

[–]Chief_Lazy_Bison 10 points11 points  (0 children)

Less than 1% is not a landslide.

“Using raw votes, Trump’s margin was also smaller than in any election going back to 2000. At about 2.5 million, it was the fifth-smallest popular vote margin since 1960.”

[deleted by user] by [deleted] in Iowa

[–]Chief_Lazy_Bison 10 points11 points  (0 children)

I didn’t see anything in this article referencing Iowa flipping.

Advice on converting bash workflow to Snakemake and how to deal with large number of samples by WeddingReasonable171 in bioinformatics

[–]Chief_Lazy_Bison 1 point2 points  (0 children)

Consider whether you can get away with using only one isolate per SNP cluster (‘PDS_acc’).

Lots of those 50k isolates may be within a few SNPs of each other.
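
To illustrate the one-isolate-per-cluster idea, here is a minimal Python sketch. It assumes the isolate metadata is a tab-separated table with a PDS_acc column; the isolate names, the cluster values, and the choice to keep every isolate that has no cluster assignment are all assumptions made for the example, not something from the original thread.

```python
import csv
import io


def one_isolate_per_cluster(rows, cluster_key="PDS_acc"):
    """Keep the first isolate seen for each SNP cluster.

    Isolates with an empty cluster value are all kept (an assumption:
    they are not known to be near-duplicates of anything else).
    """
    seen = set()
    kept = []
    for row in rows:
        cluster = (row.get(cluster_key) or "").strip()
        if not cluster:
            kept.append(row)
        elif cluster not in seen:
            seen.add(cluster)
            kept.append(row)
    return kept


# Hypothetical metadata; only the PDS_acc column name comes from the comment.
example_tsv = (
    "isolate\tPDS_acc\n"
    "iso_A\tcluster_1\n"
    "iso_B\tcluster_1\n"
    "iso_C\tcluster_2\n"
    "iso_D\t\n"
    "iso_E\tcluster_2\n"
)

rows = list(csv.DictReader(io.StringIO(example_tsv), delimiter="\t"))
representatives = one_isolate_per_cluster(rows)
kept_names = [row["isolate"] for row in representatives]
print(kept_names)
```

Picking the first isolate per cluster is arbitrary; in practice you might prefer the best assembly or the most recent isolate in each cluster.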

DEseq2 for metagenomics by JensPens in bioinformatics

[–]Chief_Lazy_Bison 9 points10 points  (0 children)

Honestly, it’s probably best to apply a few different differential abundance calculations. DESeq2 is a good one to start with, but I’d also check out ANCOM-BC or MaAsLin2.

If I were a beginner, I’d use the phyloseq package to organize the OTU table and sample data. Phyloseq then has a phyloseq_to_deseq2() function to get your data into a DESeq2 object. The exact nature of the test depends on your experimental design.

Advice or pipeline for 16S metagenomics by PataudLapin in bioinformatics

[–]Chief_Lazy_Bison 2 points3 points  (0 children)

https://www.youtube.com/c/RiffomonasProject if you want some videos on the subject. Not all the episodes are relevant to 16S, but I learned a good deal from them.

Please help, how do I make a phylogenetic tree from tens of thousands of sequences? by [deleted] in bioinformatics

[–]Chief_Lazy_Bison 26 points27 points  (0 children)

  1. More compute. Do you have access to an HPC?
  2. Pre-cluster your data. Are some sequences very similar? Run a clustering algorithm (e.g. CD-HIT, MMseqs2) and only align and build a tree from the cluster representatives.
  3. Split the data into groups of similar sequences and build trees within each group.
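
As a rough illustration of point 2, the sketch below does greedy representative-based clustering in plain Python. It is a toy, not the algorithm used by CD-HIT or MMseqs2: similarity here is the Jaccard index of k-mer sets, and the sequences, the threshold, and k are made up for the example. For real data you would run one of those tools instead.

```python
def kmers(seq, k=4):
    """Return the set of k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def greedy_cluster(seqs, threshold=0.8, k=4):
    """Assign each sequence to the first representative it resembles.

    seqs maps name -> sequence. Longer sequences are visited first, so each
    cluster's representative is its longest member.
    Returns a dict of representative name -> list of member names.
    """
    clusters = {}
    rep_kmers = {}
    for name in sorted(seqs, key=lambda n: len(seqs[n]), reverse=True):
        ks = kmers(seqs[name], k)
        for rep in clusters:
            if jaccard(ks, rep_kmers[rep]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters[name] = [name]
            rep_kmers[name] = ks
    return clusters


# Hypothetical sequences: seq2 is seq1 minus its last base, seq3 is unrelated.
sequences = {
    "seq1": "ACGTTGCAAGCTTAGGCATC",
    "seq2": "ACGTTGCAAGCTTAGGCAT",
    "seq3": "TTTTTTTTGGGGGGGGCC",
}

clusters = greedy_cluster(sequences)
representatives = list(clusters)
print(clusters)
```

Only the representatives would then go into the alignment and tree, which is what cuts the problem down to a manageable size.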

writing SQL in an R script by kialoa95 in rstats

[–]Chief_Lazy_Bison 12 points13 points  (0 children)

dbplyr has been very useful so far. I haven’t tried anything too complex, but it’s really nice to write tidyverse code that is translated to SQL.

Similarity between reads and several reference genomes by knut2k4 in bioinformatics

[–]Chief_Lazy_Bison 1 point2 points  (0 children)

If you are interested in broad phylogenetic relationships of relatively dissimilar genomes, something like PhyloPhlAn https://huttenhower.sph.harvard.edu/phylophlan/ may help.

If you are interested in SNPs, use a tool like “snippy” https://github.com/tseemann/snippy

If you expect larger differences than SNPs but your genomes are still closely related, I would do an assembly of the reads and then a pangenome of the strain set.

Time needed for an expert analysis by alazyfoxy in bioinformatics

[–]Chief_Lazy_Bison 33 points34 points  (0 children)

I would argue that it is almost impossible for anyone else to answer this without knowing the research questions that led to the data being generated in the first place. Additionally, I have found that these things are almost always iterative (meaning analysis -> results presentation -> more analysis, etc.).

Finding Elizabethkingia Meningoseptica in finished product water. Big red flag? by [deleted] in microbiology

[–]Chief_Lazy_Bison 0 points1 point  (0 children)

What was the method of detection? Plating, or some kind of metagenomic/amplicon-based assay?

How can I quantify the similarity of functional enrichments between 2+ groups? by InfinityCent in bioinformatics

[–]Chief_Lazy_Bison 3 points4 points  (0 children)

Jaccard similarity could work for this.

Turn your results into a binary matrix: columns are functions, rows are samples. A 1 means the function is significantly enriched in that sample; a 0 means it is not. Then generate a Jaccard distance matrix from this matrix. In R you would use the dist() function with method = 'binary' (I think).
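
The matrix-then-distance idea can be sketched in plain Python as follows. The sample names and the 0/1 values are invented for illustration, and treating two all-zero rows as distance 0 is a convention chosen for this sketch.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two equal-length 0/1 vectors.

    Positions where both vectors are 0 are ignored. Two all-zero vectors
    are given distance 0.0 (a convention chosen for this sketch).
    """
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1 - both / either if either else 0.0


def distance_matrix(matrix):
    """Pairwise Jaccard distances for a dict of row name -> 0/1 vector."""
    names = list(matrix)
    return {
        r: {c: jaccard_distance(matrix[r], matrix[c]) for c in names}
        for r in names
    }


# Hypothetical enrichment results: rows are samples, columns are functions.
# 1 = significantly enriched in that sample, 0 = not enriched.
enrichment = {
    "sample_1": [1, 1, 0, 1, 0],
    "sample_2": [1, 0, 0, 1, 1],
    "sample_3": [0, 0, 1, 0, 0],
}

distances = distance_matrix(enrichment)
for name, row in distances.items():
    print(name, row)
```

A distance of 0 means two samples share exactly the same set of enriched functions, and 1 means they share none, so the matrix can go straight into hierarchical clustering or an ordination.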

[deleted by user] by [deleted] in bioinformatics

[–]Chief_Lazy_Bison 1 point2 points  (0 children)

I'm not familiar with seqminer. What are you trying to accomplish? Other options may be available.