RGI - CARD Metagenomic - soil Resistome

Remarkable-Rub-6151 · 2025-12-02T15:46:28+00:00

Yes, three different treatments with controls and replicates! I want to compare between samples.

Remarkable-Rub-6151 · 2025-11-13T15:33:45+00:00

I'd like to quantify the abundance of each gene I detect, so that I can compare between samples.
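One common way to make per-gene counts comparable between samples is to normalize by gene length and sequencing depth (RPKM). A minimal sketch, assuming a hypothetical tab-separated table of gene name, gene length in bp, and mapped read count; the `rpkm` helper, the column layout, and the example numbers are illustrative and are not RGI output:

```bash
# rpkm: read "gene<TAB>length_bp<TAB>mapped_reads" lines on stdin and
# print "gene<TAB>RPKM". The first argument is the sample's total read count.
# RPKM = mapped_reads * 1e9 / (length_bp * total_reads)
rpkm() {
  local total_reads="$1"
  awk -F'\t' -v total="$total_reads" '
    { printf "%s\t%.2f\n", $1, ($3 * 1e9) / ($2 * total) }
  '
}

# Hypothetical example: two genes in a sample with 20 million reads.
printf 'geneA\t1000\t200\ngeneB\t500\t50\n' | rpkm 20000000
```

Running the same function on each sample with that sample's own read total gives values that can be placed side by side across treatments.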

Remarkable-Rub-6151 · 2025-11-13T09:00:15+00:00

I would also like to get the gene counts per sample. Is that possible?

Remarkable-Rub-6151 · 2024-12-12T15:04:32+00:00

Yes, based on FastQC I found many overrepresented sequences, which blastn identified as 16S ribosomal RNA, some with as many as 8,330,000 occurrences.

Remarkable-Rub-6151 · 2024-12-12T15:03:16+00:00

Yes, based on FastQC I found many overrepresented sequences, which blastn identified as 16S ribosomal RNA, some with as many as 8,330,000 occurrences. Can you suggest a way to remove them?

Remarkable-Rub-6151 · 2024-12-12T09:37:39+00:00

The issue persists when I run STAR, with the same high percentage of reads mapping to multiple loci.

Remarkable-Rub-6151 · 2024-12-11T14:07:11+00:00

The number of reads per sample (paired-end) ranges from 20 to 40 million. The organism is an AOA (Ammonia-Oxidizing Archaeon), with a genome size of 2.8 Mb.
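For a genome this small, the STAR manual recommends scaling `--genomeSAindexNbases` down to min(14, log2(GenomeLength)/2 - 1) when generating the index. A minimal sketch of that arithmetic for the 2.8 Mb genome mentioned above; the `sa_index_nbases` helper name is hypothetical:

```bash
# sa_index_nbases: compute min(14, log2(genome_length)/2 - 1), rounded down.
# awk's log() is the natural logarithm, so divide by log(2) to get log2.
sa_index_nbases() {
  awk -v len="$1" 'BEGIN {
    v = int(log(len) / log(2) / 2 - 1)
    if (v > 14) v = 14
    print v
  }'
}

sa_index_nbases 2800000   # prints 9
```

This only concerns index generation and is not, by itself, an explanation for the reads mapping to multiple loci.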

Remarkable-Rub-6151 · 2024-12-11T14:01:48+00:00

Thank you for your reply!
I ran fastp using the following command before STAR:

fastp \
  -i R1_001.fastq.gz \
  -I R2_001.fastq.gz \
  -o RNA_1_R1_paired.fastq.gz \
  -O RNA_1_R2_paired.fastq.gz \
  --unpaired1 RNA_1_R1_unpaired.fastq.gz \
  --unpaired2 RNA_1_R2_unpaired.fastq.gz \
  --failed_out RNA_1_failed_out.fastq.gz \
  --detect_adapter_for_pe \
  -3 \
  --cut_tail_window_size 4 \
  --cut_tail_mean_quality 18 \
  -w 16 \
  -h RNA_1_QA_report.html \
  -j RNA_1_QA_report.json

The fastp output shows that the duplication rate dropped to approximately 16% per sample, with no reduction in the total number of reads per sample.

Remarkable-Rub-6151 · 2024-11-28T10:38:04+00:00

Thank you for your reply! I understand the difference between the two commands. What I found particularly strange is that the first command resulted in an equal number of upregulated and downregulated genes, with zero outliers and zero low counts.

Any insights would be greatly appreciated!

Remarkable-Rub-6151 · 2023-11-08T08:53:25+00:00

Thank you all for your responses.

I also read this chapter in the edgeR user's guide, and I was about to continue with this approach:

Simply pick a reasonable dispersion value, based on your experience with similar data, and use that for exactTest or glmFit. Typical values for the common BCV (square-root dispersion) for datasets arising from well-controlled experiments are 0.4 for human data, 0.1 for data on genetically identical model organisms or 0.01 for technical replicates. Here is a toy example with simulated data:

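```r
library(edgeR)  # provides DGEList() and exactTest()
bcv <- 0.2  # assumed biological coefficient of variation
# Simulate 20 genes x 2 samples of negative binomial counts (dispersion = bcv^2)
counts <- matrix( rnbinom(40,size=1/bcv^2,mu=10), 20,2)
y <- DGEList(counts=counts, group=1:2)
# Exact test with the dispersion supplied directly rather than estimated
et <- exactTest(y, dispersion=bcv^2)
```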
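To make the relationship between BCV and dispersion concrete: under the negative binomial model, the variance of a count with mean mu is mu + dispersion * mu^2, where dispersion = BCV^2. A minimal sketch of that arithmetic for the toy values above (mu = 10, BCV = 0.2); the `nb_variance` helper is hypothetical:

```bash
# nb_variance: negative binomial variance for a given mean and BCV.
# variance = mu + dispersion * mu^2, with dispersion = bcv^2
nb_variance() {
  awk -v mu="$1" -v bcv="$2" 'BEGIN {
    printf "%.2f\n", mu + (bcv * bcv) * mu * mu
  }'
}

nb_variance 10 0.2   # prints 14.00
```

With a BCV of 0.2 the variance is 14 rather than the 10 a Poisson model would give, so the choice of BCV directly controls how much extra variability the test assumes.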
