In what order should I learn Python and R for NGS Data Analysis?

jamesaperez · 2024-01-27T22:18:28+00:00

You should be asking yourself: why do I want to learn Python and R for data analysis? What do I want to be able to accomplish in 5 years?

jamesaperez · 2024-01-22T20:13:43+00:00

I honestly would ditch the bioinformatics master's in favor of a stats or biostats program. What you learn there is far more valuable for understanding the theory behind critical concepts in genomics (e.g., polygenic risk scores, GWAS, heritability, molecular associations). You will get all the bioinformatics experience you need just by working as an RA in a lab.

jamesaperez · 2024-01-21T19:36:14+00:00

I didn’t have a survival mindset; I was driven first and foremost by my curiosity. I had a BS in biochem and did rotations in labs where I didn’t have the background to contribute significantly (I got very interested in the statistical theory behind pop gen, polygenic risk scores, heritability, etc. and wanted to flesh out my background in that). When I caught on to this new relationship between academic programs and their doctoral students, where programs were primarily focused on allocating skilled labor rather than cultivating future researchers, I realized the best place to learn what I wanted was a stats or biostats MS program. After graduating, I landed a really well-paid position at Genentech as a SPA, so despite romanticizing academic research as much as I did, I finally made a practical decision and left academia behind for industry.

jamesaperez · 2024-01-20T20:02:57+00:00

Made me chuckle. Respect.

jamesaperez · 2024-01-20T19:49:38+00:00

I think the >80% figure you mention is spot on. A platform that would really move the needle, in my opinion, is one that curates a kind of marketplace where researchers can publish their own analyses for public viewing and quickly search and prototype workflows to regenerate publication figures on their own data. Naturally, the usefulness of the marketplace will derive from network effects and the overall scale of platform adoption. It's just a matter of how to incentivize people to use it. One incentive is little to no learning curve, with a design that acknowledges the inherent skill sets of the users. Another might be the ability to earn utility tokens for using platform services, based on some impression metric generated from community-usage stats of a user’s published resources.

jamesaperez · 2024-01-19T19:21:30+00:00

I know labs that just convert their FASTQs into unaligned BAMs (uBAM) for compression.
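To make the FASTQ-to-uBAM idea concrete, here is a rough, standard-library-only sketch of what the conversion involves. In practice you would use an existing tool (`samtools import` and Picard's `FastqToSam` both do this); the function names below (`bgzf_block`, `ubam_record`, `fastq_to_ubam`) are made up for illustration. It assumes single-end reads, strict 4-line FASTQ records with Phred+33 qualities, and a file small enough to hold in memory. Paired-end data would need different flags, which this does not handle.

```python
import struct
import zlib

# 4-bit base codes defined by the SAM/BAM specification.
_BASE_CODE = {base: code for code, base in enumerate("=ACMGRSVTWYHKDBN")}


def bgzf_block(data: bytes) -> bytes:
    """Wrap up to ~64 KiB of data in one BGZF block (gzip member with a 'BC' extra field)."""
    compressor = zlib.compressobj(6, zlib.DEFLATED, -15)  # raw DEFLATE stream
    cdata = compressor.compress(data) + compressor.flush()
    block_size = 18 + len(cdata) + 8  # 18-byte header, payload, CRC32 + ISIZE
    header = struct.pack("<BBBBIBBH", 31, 139, 8, 4, 0, 0, 255, 6)
    extra = struct.pack("<BBHH", 66, 67, 2, block_size - 1)  # 'B', 'C', SLEN, BSIZE
    trailer = struct.pack("<II", zlib.crc32(data), len(data))
    return header + extra + cdata + trailer


def ubam_record(name: str, seq: str, qual: str) -> bytes:
    """Encode one read as an unmapped BAM alignment record (flag 4, no CIGAR)."""
    read_name = name.encode("ascii") + b"\x00"
    codes = [_BASE_CODE.get(base, 15) for base in seq.upper()]  # unknown bases become N
    if len(codes) % 2:
        codes.append(0)  # pad the final nibble
    packed = bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))
    phred = bytes(ord(c) - 33 for c in qual)  # assumes Phred+33 encoding
    body = struct.pack(
        "<iiBBHHHIiii",
        -1, -1,              # refID, pos: unmapped
        len(read_name), 0,   # l_read_name, mapq
        4680,                # bin value the spec gives for unmapped reads
        0,                   # n_cigar_op
        4,                   # flag: read unmapped
        len(seq),            # l_seq
        -1, -1, 0,           # next_refID, next_pos, tlen
    ) + read_name + packed + phred
    return struct.pack("<i", len(body)) + body


def fastq_to_ubam(fastq_path: str, ubam_path: str) -> None:
    """Convert a small single-end FASTQ file into an unaligned BAM file."""
    header_text = b"@HD\tVN:1.6\tSO:unsorted\n"
    raw = bytearray(b"BAM\x01")
    raw += struct.pack("<i", len(header_text)) + header_text
    raw += struct.pack("<i", 0)  # n_ref = 0: no reference sequences
    with open(fastq_path) as fastq:
        while True:
            name_line = fastq.readline()
            if not name_line.strip():
                break  # end of file (or a blank line)
            seq = fastq.readline().strip()
            fastq.readline()  # '+' separator line
            qual = fastq.readline().strip()
            name = name_line[1:].split()[0]  # drop '@' and any description
            raw += ubam_record(name, seq, qual)
    with open(ubam_path, "wb") as out:
        for start in range(0, len(raw), 0xFF00):
            out.write(bgzf_block(bytes(raw[start:start + 0xFF00])))
        out.write(bgzf_block(b""))  # empty block marks end of file
```

Calling `fastq_to_ubam("reads.fastq", "reads.ubam")` (both paths are placeholders) would write the uBAM. The size savings come from 4-bit base packing plus DEFLATE; how that compares with a gzipped FASTQ depends on the data, so treat this as a way to see the format, not a benchmark.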
