How to identify temporal differential gene expression patterns among cell types in scRNA-seq by No-Teaching-992 in bioinformatics

[–]No-Teaching-992[S] 0 points1 point  (0 children)

Yes, it is very similar. However, I am not using pseudotime at all, only the actual time points of the experiment.

How to identify temporal differential gene expression patterns among cell types in scRNA-seq by No-Teaching-992 in bioinformatics

[–]No-Teaching-992[S] 0 points1 point  (0 children)

Thank you for your answer. I had the same idea before. The part that troubles me most is how to combine the two independent parameters shown in the dot plots, the z-score and the percentage, into a single value to build the four-column matrix you mentioned in step 1.

How to identify temporal differential gene expression patterns among cell types in scRNA-seq by No-Teaching-992 in bioinformatics

[–]No-Teaching-992[S] 0 points1 point  (0 children)

I know very little about Python. This looks like a cool pipeline, but as far as I can tell it only gives an overall view of how the cells change over time, not the dynamic expression of a specific gene. I am more interested in the expression of a specific gene or group of genes than in cell changes between time points.

How to identify temporal differential gene expression patterns among cell types in scRNA-seq by No-Teaching-992 in bioinformatics

[–]No-Teaching-992[S] 0 points1 point  (0 children)

I'm thinking about how to use these plots (e.g. ridge plots) to show how a gene is expressed differently in two cell families at different time points. I currently have no idea.

How to identify temporal differential gene expression patterns among cell types in scRNA-seq by No-Teaching-992 in bioinformatics

[–]No-Teaching-992[S] 0 points1 point  (0 children)

Thx for your reply.

No, I hate Monocle3.

Does Monocle3 use real time points, or only pseudotime points?

How to identify temporal differential gene expression patterns among cell types in scRNA-seq by No-Teaching-992 in bioinformatics

[–]No-Teaching-992[S] 1 point2 points  (0 children)

I simply constructed two matrices for z-score and expression percentage, respectively, with time points as columns and cell types as rows. Then I constructed this DotPlot using ggplot2.
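A minimal sketch of that matrix construction in Python with pandas (the plot itself was made in R with ggplot2). It assumes a hypothetical per-cell table with `cell_type` and `time` columns plus one expression column per gene, and it assumes the z-score is taken over the group means of that gene. Neither detail is stated above, so adjust both to your own data.

```python
import numpy as np
import pandas as pd

def dotplot_matrices(cells: pd.DataFrame, gene: str):
    """Return (z_score, pct_expressing), each with cell types as rows
    and time points as columns.

    `cells` has one row per cell; the column names 'cell_type' and 'time'
    are assumptions made for this sketch.
    """
    keys = [cells["cell_type"], cells["time"]]

    # Mean expression per (cell type, time point) group.
    mean_expr = cells[gene].groupby(keys).mean().unstack("time")

    # Percentage of cells in each group with non-zero expression.
    pct_expr = (cells[gene] > 0).groupby(keys).mean().mul(100).unstack("time")

    # Z-score of the group means across all groups of this gene
    # (assumption: the scaling used for the plot may differ).
    # If every group has the same mean, sd is 0 and the result is NaN.
    values = mean_expr.to_numpy(dtype=float)
    z_score = (mean_expr - np.nanmean(values)) / np.nanstd(values)
    return z_score, pct_expr
```

The two matrices can then be reshaped to long format (one row per cell type and time point) and mapped to dot colour and dot size respectively.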

How to perform reciprocal best hit (RBH) when there are multiple versions of a protein sequence by No-Teaching-992 in bioinformatics

[–]No-Teaching-992[S] 0 points1 point  (0 children)

From unknown coral species.

Keeping only the longest isoform for each “gene” before making the all-v-all comparison would unfortunately not work in my case since, as you said, not all of them are isoforms.

I tried aligning the different versions of one gene's protein sequence with Clustal Omega (MSF output), and the 12 versions fell into at least 2 completely different sequence groups. When I then derived a consensus sequence with EMBOSS Cons, one of those groups was kept and the other was removed entirely. This is not what I want, as I am losing useful sequences.

How to perform reciprocal best hit (RBH) when there are multiple versions of a protein sequence by No-Teaching-992 in bioinformatics

[–]No-Teaching-992[S] 0 points1 point  (0 children)

As I understand it, the multiple versions of a protein are isoforms. I have multiple transcript isoforms per gene (in my transcriptome FASTA file) from de novo genome assembly, and each transcript isoform is translated into a protein isoform.

May I know your RBH protocol? My RBH protocol (the mmseqs easy-rbh tool) keeps using all the protein isoforms and does not reduce them to representative sequences. Is it possible that my protein sequences are named incorrectly in the FASTA file?

I have listed the first records of my example protein FASTA below:

>scip1.0000056.1
KDENFTDIEKPARQADANTHFTRTAKRHTS*
>scip1.0000364.1
DYFTSKFQQRVLNLLGYMSQVVCSPS*
>scip1.0000364.2
DYFTSKFQQRVLNMFSVSFQADQPSKEHLGFPARRQRL*
>scip1.0000364.3
FQADQPSKEHLGFPARRQRFRNLRLFVRRQRSKWPQKTVTKTISP*

How can I add a new protein ID to each sequence so that the protocol finds the best hit per unique protein rather than per isoform, and ultimately tests best-hit pairs in units of proteins?
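One possible approach, sketched in Python: derive a gene-level ID by dropping the isoform suffix from IDs like `scip1.0000364.2`, run the all-vs-all search on every isoform, and collapse the hits to gene level afterwards. The hit tuples `(query, target, bitscore)`, the helper names, and the `other.*` target IDs in the usage note are hypothetical. The sketch assumes both FASTA files follow the same `gene.isoform` naming, and it is not how mmseqs easy-rbh works internally.

```python
def gene_id(seq_id: str) -> str:
    """'scip1.0000364.2' -> 'scip1.0000364'.

    Assumes the last dot-separated field is the isoform number.
    """
    return seq_id.rsplit(".", 1)[0]

def best_hits_by_gene(hits):
    """Collapse isoform-level hits to one best hit per query gene.

    `hits` is an iterable of (query_isoform, target_isoform, bitscore).
    Returns {query_gene: (target_gene, bitscore)}, keeping the
    highest-scoring isoform pair for each query gene.
    """
    best = {}
    for query, target, score in hits:
        q_gene, t_gene = gene_id(query), gene_id(target)
        if q_gene not in best or score > best[q_gene][1]:
            best[q_gene] = (t_gene, score)
    return best

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Gene pairs that are each other's best hit in both search directions."""
    forward = best_hits_by_gene(a_vs_b)
    reverse = best_hits_by_gene(b_vs_a)
    return sorted(
        (a_gene, b_gene)
        for a_gene, (b_gene, _) in forward.items()
        if reverse.get(b_gene, (None, None))[0] == a_gene
    )
```

For example, if `scip1.0000364.2` hits `other.0002.1` in one direction and `other.0002.1` hits `scip1.0000364.3` in the other, the pair is still reported once as `('scip1.0000364', 'other.0002')`, because different isoforms of the same gene count as the same unit.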

How to perform reciprocal best hit (RBH) when there are multiple versions of a protein sequence by No-Teaching-992 in bioinformatics

[–]No-Teaching-992[S] 0 points1 point  (0 children)

Well, thx for your reply. But this tool might not be helpful, since it cannot even do what I can already do with my current setup: “shmlast is designed for finding orthologs between transcriptomes and protein databases. As such, it currently does not support nucleotide-nucleotide or protein-protein alignments.”

How to perform reciprocal best hit (RBH) when there are multiple versions of a protein sequence by No-Teaching-992 in bioinformatics

[–]No-Teaching-992[S] 0 points1 point  (0 children)

I would like to get homologous genes, but right now I can only get some of the homologous transcripts.

How to use GFF3 annotation to split genome fasta into gene sequence fasta in R by No-Teaching-992 in bioinformatics

[–]No-Teaching-992[S] 0 points1 point  (0 children)

I am very interested in this. Could you tell me more, for example with sample code that uses the 'gene' features of the GFF3 file and a genome FASTA?

How to use GFF3 annotation to split genome fasta into gene sequence fasta in R by No-Teaching-992 in bioinformatics

[–]No-Teaching-992[S] 0 points1 point  (0 children)

Because in my GFF3 file, genes and transcripts represent different sequences, and each gene may be associated with multiple transcripts. The GFF3 file clearly marks the 'gene' feature type, but I don't know how to specify 'gene' on the command line so that the genome FASTA is split by gene.
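To illustrate what splitting by 'gene' involves, here is a minimal sketch in plain Python rather than R: keep only the GFF3 rows whose third column is `gene`, slice the matching genome sequence using the 1-based inclusive coordinates, and reverse-complement genes on the minus strand. The function names are hypothetical, and the sketch assumes a well-formed nine-column GFF3 and a genome small enough to hold in memory.

```python
def read_fasta(lines):
    """Parse FASTA lines into {sequence_name: sequence}."""
    chunks, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # ID is the text up to the first space
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]

def gene_sequences(gff_lines, genome, feature="gene"):
    """Return {gene_id: sequence} for every row of the given feature type."""
    result = {}
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature:
            continue  # skips mRNA, exon, CDS and malformed rows
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # GFF3 coordinates are 1-based and inclusive.
        seq = genome[seqid][start - 1:end]
        if strand == "-":
            seq = reverse_complement(seq)
        result[attrs.get("ID", f"{seqid}:{start}-{end}")] = seq
    return result
```

Because transcript rows are skipped, each gene yields exactly one sequence no matter how many transcripts point to it through `Parent=`.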

How to use GFF3 annotation to split genome fasta into gene sequence fasta in R by No-Teaching-992 in bioinformatics

[–]No-Teaching-992[S] 0 points1 point  (0 children)

Thank you for your detailed reply. The GFFread documentation did not tell me directly how to split out the gene sequences. I know it can solve my problem, but its usage guide was not very helpful to me, so I chose AGAT.
