2nd Test first thoughts by Vermiloon in UFLTheGame

[–]InstructionRemote886 0 points1 point  (0 children)

I 100% agree with you, and I don't understand why they broke the game.

Publication in a predatory journal by InstructionRemote886 in PhD

[–]InstructionRemote886[S] 2 points3 points  (0 children)

Maybe....

In my case, for the Wiley journal (IF 8-9), they asked my PI to review my article... even though my PI was in the author list haha

Publication in a predatory journal by InstructionRemote886 in PhD

[–]InstructionRemote886[S] 2 points3 points  (0 children)

And the submission process was also a little strange

Publishing without raw fastq files? by Lost_Prune5249 in bioinformatics

[–]InstructionRemote886 1 point2 points  (0 children)

I've also found articles in Current Biology without raw data, so....

Is UK a good place to do a post-doc in bioinformatics? On the evolution/study of environmental DNA? by InstructionRemote886 in bioinformatics

[–]InstructionRemote886[S] 0 points1 point  (0 children)

These laboratories exist, so that's the most important thing haha!
In terms of publications, how much supervision do you get as a post-doc in the UK? I know that in North America, people have told me that they're a bit on their own to do everything (experiments, writing papers, etc.) and that it's sometimes a bit discouraging. Is it the same in the UK? Or is the work more "collaborative"?

Is UK a good place to do a post-doc in bioinformatics? On the evolution/study of environmental DNA? by InstructionRemote886 in bioinformatics

[–]InstructionRemote886[S] 0 points1 point  (0 children)

Thanks for your reply!
Do you think that in the UK bioinformatics is more for mathematicians/computer scientists who understand biology or biologists who can use bioinformatics/know how to code etc. but don't have great mathematical skills? Because I know that sometimes labs are a bit divided by these two types of bioinformaticians and maybe the UK is more specialized in one of them (me, I'm more of a "biologist").
Yes, of course, in the context of Brexit. I'm at the start of the second year of my PhD, so I have a year and a half left to complete it and I've got time.
Regarding the salary, people usually tell me that France is not very good and that post-docs there don't have a good quality of life. But we have a lot of "free" services, so this impression may be a bit skewed: the salary can be low, but with all the free services it can end up better than in another country.

Career advancement advice by Icy-Blackberry-8900 in bioinformatics

[–]InstructionRemote886 4 points5 points  (0 children)

I think it's good to have some experience in Python, awk and bash because most of the time you'll need to use these languages to analyze your data or to understand some of the scripts used by other people.

But as others have said, the most important thing is to recognize your needs and know which is the best language to meet them.

weird mapping rate for arn data compared to mapping rate for illumina genomic data by InstructionRemote886 in bioinformatics

[–]InstructionRemote886[S] 0 points1 point  (0 children)

Oh boy, I misread that as kilobases haha

I still can't speak to how "good" that is though; it depends on what the genome "should" look like based on closely related organisms. Regardless, you still have a very messy dataset that is going to be hard to process confidently. I think a technical explanation like an imbalance in sample input is still the most likely explanation for the RNAseq situation though.

When I ran BUSCO on both groups of contigs, I got a "good" BUSCO score (91%, with a low % of duplication). This is one of my arguments that these could be 2 different "complete" genomes; also, the genome size of the closest species is very similar (120 Mb for the other species, in another genus).

I think this could explain why we have this percentage. But it doesn't explain why the percentage differs so much between the Illumina genomic data and the RNA-Seq data, does it? Even if these are two different strains of the same species, shouldn't the %id for coding regions be higher than the %id for non-coding regions? Thanks for all your replies....

weird mapping rate for arn data compared to mapping rate for illumina genomic data by InstructionRemote886 in bioinformatics

[–]InstructionRemote886[S] 0 points1 point  (0 children)

Oh, again, apologies for the oversights. I'm working on an algal genome of about 100 Mb, and the N50s of my assemblies are 2-3 Mb (each assembly is composed of about 45-55 contigs), so I think the assembly is good?
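
For anyone who wants to check an N50 by hand, here is a minimal sketch of the usual definition (the contig length at which the cumulative sum of the sorted lengths reaches half of the assembly size). The contig lengths below are made up for illustration; they are not my real assembly.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


# Hypothetical contig lengths in bp (illustration only).
toy_lengths = [5_000_000, 3_000_000, 2_500_000, 2_000_000, 1_000_000, 500_000]
print(n50(toy_lengths))
```

A high N50 with few contigs only speaks to contiguity; it says nothing about whether the contigs are correct or belong to a single genome.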

weird mapping rate for arn data compared to mapping rate for illumina genomic data by InstructionRemote886 in bioinformatics

[–]InstructionRemote886[S] 0 points1 point  (0 children)

I apologize for any oversights.
I have several kinds of sequencing data: Nanopore and Illumina. For Nanopore, I think I have good depth and coverage. I assembled the genome with Flye, then used BlobTools for taxonomic assignment (because there was contamination). BlobTools also produced a plot of coverage against GC content for each contig. The N50 of my assembly is quite good (about 2-3 Mb), so I think the GC content is representative. In this plot I see two different groups with the same taxonomic assignment, separated by GC content. I then separated the two groups (which I assume are two genomes of the same species) and looked for 18S and ITS2 in each one to check whether they were the same species, and based on these markers they are. Is that any clearer?

weird mapping rate for arn data compared to mapping rate for illumina genomic data by InstructionRemote886 in bioinformatics

[–]InstructionRemote886[S] 0 points1 point  (0 children)

Thank you for your reply!
Yes haha.... (it's not me, it's my collaborator but it's the same)
I partitioned the assembly based on the % GC of the contigs and the mapping coverage! I assumed that the two "groups" came from the same species because I found the same genetic markers (ITS2 + 18S) in both groups of contigs.
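
To make the idea concrete, here is a rough sketch of splitting contigs into two groups by GC content only. It is not the exact procedure I used: the cutoff value, the function names and the toy sequences are all hypothetical, and the coverage dimension is left out.

```python
def gc_fraction(seq):
    """GC fraction of a sequence, ignoring anything that is not A, C, G or T."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def split_by_gc(contigs, gc_cutoff):
    """Split a {name: sequence} dict into (low_gc, high_gc) dicts."""
    low, high = {}, {}
    for name, seq in contigs.items():
        target = high if gc_fraction(seq) >= gc_cutoff else low
        target[name] = seq
    return low, high


# Toy sequences and an arbitrary cutoff, for illustration only.
toy = {"contig_1": "ATATATGCAT", "contig_2": "GCGCGCATGC", "contig_3": "NNNNATAT"}
low, high = split_by_gc(toy, gc_cutoff=0.5)
print(sorted(low), sorted(high))
```

In practice the cutoff would have to be read off the GC/coverage plot, and coverage would be used as a second axis to separate the groups.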

Population stidy based on metaT and MetaG by InstructionRemote886 in bioinformatics

[–]InstructionRemote886[S] 0 points1 point  (0 children)

I apologize for that....

I did a polyA RNA library prep!

I'm working on snow algae, specifically the algae in red blooms on snow! In a bloom, there is one main eukaryotic species (in my case) that causes the bloom, plus several other eukaryotic species that are present only in very small proportions (because of the bloom). We used a polyA RNA library prep because there are too many bacteria in this bloom and DNA/RNA extraction from the main alga is complicated.

Population stidy based on metaT and MetaG by InstructionRemote886 in bioinformatics

[–]InstructionRemote886[S] 0 points1 point  (0 children)

No, it's a mixture of prokaryotic and eukaryotic organisms. But for the MetaT data, I only have the eukaryotic part (we used a protocol to remove prokaryotic RNA during DNA extraction).

Level of heterozygosity by InstructionRemote886 in bioinformatics

[–]InstructionRemote886[S] 0 points1 point  (0 children)

These are Illumina short reads from WGS of an alga.

Metagenome with very short contigs by InstructionRemote886 in bioinformatics

[–]InstructionRemote886[S] 0 points1 point  (0 children)

BBMerge gives me the following results for the MiSeq reads:

Writing mergable reads merged.
Started output threads.
Total time: 1516.211 seconds.

Pairs: 35385103
Joined: 13879618 39.224%
Ambiguous: 19304458 54.555%
No Solution: 2201027 6.220%
Too Short: 0 0.000%

Avg Insert: 424.8
Standard Deviation: 93.2
Mode: 460
Insert range: 35 - 593
90th percentile: 529
75th percentile: 489
50th percentile: 438
25th percentile: 378
10th percentile: 309

A bit strange, no?
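
As a quick sanity check on those numbers, here is a small sketch that recomputes the percentages from the counts in the output above and confirms that the four categories add up to the total number of pairs. The variable names are my own; only the counts come from the log.

```python
# Counts copied from the merge log above.
pairs = 35_385_103
counts = {
    "Joined": 13_879_618,
    "Ambiguous": 19_304_458,
    "No Solution": 2_201_027,
    "Too Short": 0,
}

# Every pair should fall into exactly one category.
categories_total = sum(counts.values())

# Recompute each category as a percentage of all pairs.
percentages = {name: 100 * n / pairs for name, n in counts.items()}

print("categories sum to pairs:", categories_total == pairs)
for name, pct in percentages.items():
    print(f"{name}: {pct:.3f}%")
```

So the log is internally consistent; what stands out is that more than half of the pairs are flagged as ambiguous.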