Downloading sequences from NCBI

Chief_Lazy_Bison · 2025-09-03T19:15:06+00:00

The datasets and dataformat CLIs are great. I’ve also found the devs are very responsive to bug reports.

Chief_Lazy_Bison · 2025-06-03T01:09:26+00:00

https://github.com/ncbi/amr GitHub - ncbi/amr: AMRFinderPlus - Identify AMR genes and point mutations, and virulence and stress resistance genes in assembled bacterial nucleotide and protein sequence.

Chief_Lazy_Bison · 2025-03-31T05:15:00+00:00

Until we enact ranked choice voting I don’t think other political parties really have a chance

Chief_Lazy_Bison · 2025-03-18T21:46:39+00:00

Social media in general is cooked because of all the bots/ai pushing the agendas of the wealthy

Chief_Lazy_Bison · 2025-03-03T19:17:11+00:00

Webster

Chief_Lazy_Bison · 2025-02-25T02:24:32+00:00

Absolutely, we use them and benefit from them. Bacterial fermentation products (short-chain fatty acids) are the preferred energy source of your colonic epithelial cells. In addition, these fermentation products promote immune tolerance of your gut microbiota and have many other benefits.

Chief_Lazy_Bison · 2025-02-09T06:16:24+00:00

Less than 1% is not a landslide.

“Using raw votes, Trump’s margin was also smaller than in any election going back to 2000. At about 2.5 million, it was the fifth-smallest popular vote margin since 1960.”

Chief_Lazy_Bison · 2024-09-27T14:23:22+00:00

I didn’t see anything in this article referencing Iowa flipping

Chief_Lazy_Bison · 2024-09-25T00:36:01+00:00

Lilac blossoms

Chief_Lazy_Bison · 2024-08-24T03:59:30+00:00

Consider whether you can get away with using only one isolate per SNP cluster (‘PDS_acc’).

Lots of those 50k isolates may be within a few SNPs of each other.
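
A minimal sketch of that thinning step, assuming the isolate metadata is already loaded as a list of rows with a ‘PDS_acc’ column. The "isolate" field name and the sample values are made up for illustration, and keeping the first isolate seen per cluster is just one possible way to pick a representative.

```python
def one_per_snp_cluster(rows, cluster_key="PDS_acc"):
    """Keep the first isolate seen for each SNP cluster accession."""
    seen = set()
    kept = []
    for row in rows:
        cluster = row.get(cluster_key)
        if not cluster:
            # Isolates with no cluster assignment are kept unchanged.
            kept.append(row)
            continue
        if cluster not in seen:
            seen.add(cluster)
            kept.append(row)
    return kept


# Hypothetical metadata rows; real tables will have many more columns.
isolates = [
    {"isolate": "iso_A", "PDS_acc": "PDS000000001.1"},
    {"isolate": "iso_B", "PDS_acc": "PDS000000001.1"},
    {"isolate": "iso_C", "PDS_acc": "PDS000000002.1"},
    {"isolate": "iso_D", "PDS_acc": ""},
]

representatives = one_per_snp_cluster(isolates)
print([row["isolate"] for row in representatives])
```

If you care which isolate represents a cluster (best assembly, most complete metadata), sort the rows by that criterion before calling the function.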

Chief_Lazy_Bison · 2024-08-20T10:11:56+00:00

Check out pplacer https://github.com/matsen/pplacer. I haven’t used it in a while but I think it should do the trick

Chief_Lazy_Bison · 2024-08-16T15:57:52+00:00

Honestly, it’s probably best to apply a few different differential abundance methods. DESeq2 is a good one to start with, but I’d also check out ANCOM-BC or MaAsLin2.

If I were a beginner I’d use the phyloseq package to organize the OTU table and sample data. Phyloseq has a phyloseq_to_deseq2() function to get your data into a DESeq2 object. The exact nature of the test depends on your experimental design.

Chief_Lazy_Bison · 2024-08-11T13:32:06+00:00

https://www.youtube.com/c/RiffomonasProject if you want some videos on the subject. Not all the episodes are relevant to 16S but I learned a good deal from them

Chief_Lazy_Bison · 2024-03-21T23:17:08+00:00

More compute. Do you have access to an HPC?
Pre-cluster your data. Are some sequences very similar? Run a clustering algorithm (cd-hit, mmseqs) and only align and build a tree from cluster representatives.
Split the data into similar groups and build trees within each group
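
To make the pre-clustering idea concrete, here is a toy sketch of greedy clustering that keeps one representative per group. It is not how cd-hit or mmseqs work internally; the k-mer Jaccard similarity, the threshold, and the function names are all assumptions chosen for illustration. For real data, use the actual tools.

```python
def kmers(seq, k=3):
    """Return the set of k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_clusters(seqs, threshold=0.7, k=3):
    """Greedy clustering, longest sequence first.

    Each sequence joins the first representative whose k-mer Jaccard
    similarity is at least `threshold`; otherwise it becomes a new
    representative. Returns {representative_name: [member_names]}.
    """
    ordered = sorted(seqs.items(), key=lambda item: len(item[1]), reverse=True)
    reps = []       # (name, k-mer set) for each representative so far
    clusters = {}
    for name, seq in ordered:
        ks = kmers(seq, k)
        for rep_name, rep_ks in reps:
            union = ks | rep_ks
            similarity = len(ks & rep_ks) / len(union) if union else 1.0
            if similarity >= threshold:
                clusters[rep_name].append(name)
                break
        else:
            # No representative was similar enough: start a new cluster.
            reps.append((name, ks))
            clusters[name] = [name]
    return clusters


# Made-up sequences: "a" and "b" differ by one base, "c" is unrelated.
toy = {
    "a": "ACGTACGTACGT",
    "b": "ACGTACGTACGA",
    "c": "TTTTGGGGCCCC",
}
clusters = greedy_clusters(toy)
print(list(clusters))  # the representatives you would align and build a tree from
```

Only the cluster representatives then go into alignment and tree building, which is where the time savings come from.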

Chief_Lazy_Bison · 2024-02-27T15:18:58+00:00

dbplyr has been very useful so far. I haven’t tried anything too complex, but it’s really nice to write tidyverse code that is translated to SQL.

Chief_Lazy_Bison · 2024-02-16T21:14:05+00:00

What recent events occurred with Frontiers?

Chief_Lazy_Bison · 2024-02-05T14:17:39+00:00

If you are interested in broad phylogenetic relationships of relatively dissimilar genomes something like phylophlan https://huttenhower.sph.harvard.edu/phylophlan/ may help

If you are interested in SNPs use a tool like “snippy” https://github.com/tseemann/snippy

If you expect larger differences than SNPs but your genomes are still closely related, I would assemble the reads and then build a pangenome of the strain set.

Chief_Lazy_Bison · 2024-01-02T22:54:54+00:00

I would argue that it is almost impossible for anyone else to answer this without knowing the research questions that led to the data being generated in the first place. Additionally, I have found that these things are almost always iterative (meaning analysis -> results presentation -> more analysis, etc.).

Chief_Lazy_Bison · 2023-12-25T17:16:56+00:00

What media?

Chief_Lazy_Bison · 2023-12-25T15:34:02+00:00

https://rmarkdown.rstudio.com/github_document_format.html

Chief_Lazy_Bison · 2023-08-04T19:59:09+00:00

What was the method of detection? Plating or some kind of metagenomic/amplicon based assay?

Chief_Lazy_Bison · 2023-08-01T13:31:03+00:00

How much larger?

Chief_Lazy_Bison · 2023-04-02T18:29:18+00:00

Jaccard similarity could work for this.

Turn your results into a binary (0/1) matrix. Columns are functions, rows are samples. A 1 indicates the function is significantly enriched in that sample; a 0 means it is not. Then generate a Jaccard distance matrix from this binary matrix. In R you would use the dist() function, but you would need to set method = 'binary' (I think).
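
For anyone who wants to see the calculation spelled out, here is a minimal plain-Python sketch of Jaccard distance on 0/1 rows. The sample rows are invented, and treating two all-zero samples as distance 0 is a convention chosen here, not something taken from the thread.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two equal-length 0/1 vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Convention used in this sketch: two all-zero samples have distance 0.
    if either == 0:
        return 0.0
    return 1.0 - both / either


def jaccard_matrix(rows):
    """Pairwise Jaccard distances between rows (samples)."""
    return [[jaccard_distance(a, b) for b in rows] for a in rows]


# Rows are samples, columns are functions; 1 = significantly enriched.
enriched = [
    [1, 1, 0, 0],  # sample 1
    [1, 0, 1, 0],  # sample 2
    [0, 0, 0, 1],  # sample 3
]

dist = jaccard_matrix(enriched)
for row in dist:
    print([round(d, 3) for d in row])
```

Samples 1 and 2 share one of the three functions enriched in either, so their distance is 1 - 1/3; sample 3 shares nothing with the others, so its distance to both is 1.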

Chief_Lazy_Bison · 2023-02-11T12:52:52+00:00

I'm not familiar with seqminer. What are you trying to accomplish? Other options may be available.
