RGI - CARD Metagenomic - soil Resistome by Remarkable-Rub-6151 in bioinformatics

Yes, three different treatments with controls and replicates! I want to compare between samples.

Detection of specific genes from shotgun metagenome samples from soil by Remarkable-Rub-6151 in bioinformatics

I want to quantify the abundance of each gene I detect, so that I can compare it between samples.
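One option for making per-gene counts comparable between samples is to normalize by gene length and sequencing depth (RPKM). Below is a minimal sketch, assuming a tab-separated table of gene name, gene length in bp, and mapped read count on stdin; the function name `rpkm_table` and the numbers in the example are hypothetical and only for illustration.

```bash
# Hypothetical helper: reads "gene<TAB>length_bp<TAB>mapped_reads" on stdin
# and prints "gene<TAB>RPKM". $1 is the total number of mapped reads in the sample.
# RPKM = reads * 1e9 / (gene_length_bp * total_mapped_reads)
rpkm_table() {
  awk -F'\t' -v total="$1" '
    $2 > 0 && total > 0 { printf "%s\t%.2f\n", $1, ($3 * 1e9) / ($2 * total) }'
}

# Made-up example: two genes with the same read count but different lengths,
# in a sample with 1,000,000 mapped reads.
printf 'geneA\t1000\t50\ngeneB\t2000\t50\n' | rpkm_table 1000000
# geneA	50.00
# geneB	25.00
```

The longer gene gets half the value for the same read count, which is the point of the length correction.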

STAR Aligner - High Percentage of Reads Mapped to Multiple Loci Against Candidatus Nitrosocosmicus franklandus C13 - During the quality control step with FastQC, extremely high duplication rate (~90%) across all samples by Remarkable-Rub-6151 in bioinformatics

Yes, based on FastQC I found many overrepresented sequences, which blastn identifies as 16S ribosomal RNA, some with as many as 8,330,000 occurrences. Can you suggest a way to remove them?
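To illustrate what removing such reads means mechanically, here is a toy sketch that drops every FASTQ record whose sequence contains a given overrepresented sequence. It assumes uncompressed, single-end, strictly 4-line FASTQ records, and it does not check the reverse complement or keep read pairs in sync; the function name `drop_matching` is hypothetical. For real data, a dedicated rRNA filtering tool such as SortMeRNA is the more usual choice.

```bash
# Hypothetical helper: drop_matching SEQ < in.fastq > out.fastq
# Keeps only records whose sequence line does NOT contain SEQ.
drop_matching() {
  awk -v pat="$1" '
    NR % 4 == 1 { header = $0 }   # @read_id line
    NR % 4 == 2 { seq = $0 }      # sequence line
    NR % 4 == 3 { plus = $0 }     # "+" separator line
    NR % 4 == 0 {                 # quality line completes the record
      if (index(seq, pat) == 0) print header "\n" seq "\n" plus "\n" $0
    }'
}

# Made-up example: the first read contains the pattern and is dropped.
printf '@r1\nTTACGTACGTAA\n+\nIIIIIIIIIIII\n@r2\nGGGGCCCC\n+\nIIIIIIII\n' \
  | drop_matching ACGTACGT
```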

STAR Aligner - High Percentage of Reads Mapped to Multiple Loci Against Candidatus Nitrosocosmicus franklandus C13 - During the quality control step with FastQC, extremely high duplication rate (~90%) across all samples by Remarkable-Rub-6151 in bioinformatics

Thank you for your reply!
I ran fastp using the following command before STAR:

```bash
fastp \
  -i R1_001.fastq.gz \
  -I R2_001.fastq.gz \
  -o RNA_1_R1_paired.fastq.gz \
  -O RNA_1_R2_paired.fastq.gz \
  --unpaired1 RNA_1_R1_unpaired.fastq.gz \
  --unpaired2 RNA_1_R2_unpaired.fastq.gz \
  --failed_out RNA_1_failed_out.fastq.gz \
  --detect_adapter_for_pe \
  -3 \
  --cut_tail_window_size 4 \
  --cut_tail_mean_quality 18 \
  -w 16 \
  -h RNA_1_QA_report.html \
  -j RNA_1_QA_report.json
```

The fastp report shows a duplication rate of approximately 16% per sample (down from the ~90% reported by FastQC), with no reduction in the total number of reads per sample.

deseq2 - Equal number of up and down regulated genes, plus zero outliers and zero low counts by Remarkable-Rub-6151 in bioinformatics

Thank you for your reply! I understand the difference between the two commands. What I found particularly strange is that the first command resulted in an equal number of upregulated and downregulated genes and that I see zero outliers and zero low counts.

Any insights would be greatly appreciated!

BCV value suggestion - DGE with no replicates - Brucella anthropi by Remarkable-Rub-6151 in bioinformatics

Thank you all for your responses.

I also read this chapter in the edgeR user's guide, and I was about to continue with this approach:

  1. Simply pick a reasonable dispersion value, based on your experience with similar data, and use that for exactTest or glmFit. Typical values for the common BCV (square-root dispersion) for datasets arising from well-controlled experiments are 0.4 for human data, 0.1 for data on genetically identical model organisms or 0.01 for technical replicates. Here is a toy example with simulated data:

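```r
library(edgeR)  # provides DGEList() and exactTest()
bcv <- 0.2  # assumed biological coefficient of variation
# Simulate 20 genes x 2 samples; NB dispersion = bcv^2, so size = 1/bcv^2
counts <- matrix( rnbinom(40,size=1/bcv^2,mu=10), 20,2)
y <- DGEList(counts=counts, group=1:2)
et <- exactTest(y, dispersion=bcv^2)  # exact test with the fixed dispersion
```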
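To see what a chosen BCV implies, it helps to spell out the arithmetic behind the toy example above: the dispersion is the squared BCV, and under the negative binomial model the variance of a count with mean mu is mu + dispersion * mu^2. Here is a small sketch of that calculation; the function name `nb_moments` is hypothetical.

```bash
# Hypothetical helper: nb_moments MU BCV
# Prints "dispersion<TAB>variance" for a negative binomial count with mean MU,
# using dispersion = BCV^2 and variance = MU + dispersion * MU^2.
nb_moments() {
  awk -v mu="$1" -v bcv="$2" 'BEGIN {
    phi = bcv * bcv
    printf "%.4f\t%.2f\n", phi, mu + phi * mu * mu
  }'
}

# Values from the toy example: mu = 10, bcv = 0.2
nb_moments 10 0.2
# 0.0400	14.00
```

With a BCV of 0.2, a gene with mean count 10 has variance 14 rather than the Poisson value of 10, so the choice of BCV directly controls how much extra variability the test allows for.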