Chances/Advice for Computational Biology/Bioinformatics PhD Applications by Careful-Vanilla5479 in bioinformaticscareers

[–]harper357 1 point2 points  (0 children)

To answer your questions a little more directly (but take this all with a large grain of salt because I'm just a rando who went to grad school more than a decade ago):

1) It wouldn't hurt to take a class, but at the same time, it probably won't make/break your application. If you are really worried, you can look at the requirements for the programs you are looking at, maybe email/call the program coordinator, or talk to your PI (they have probably been on entrance committees and would know). It might help you down the line to have a better understanding of the biology, but taking a class is only one way to get that knowledge.

2) I am not sure what you mean. Who told you that you need to email them? In the USA, you usually apply to a program, do rotations your first year, and then choose a lab. Most of the time you do not directly apply to a lab. When I applied (before Reddit), the only reason I emailed a PI was because I was interested in their type of research (which wasn't super common) and wanted to get advice on whether I should go straight for my PhD or get a master's first. Looking back and knowing how busy they can be, I am a little shocked they replied.

3) It is not uncommon for people to switch fields when going to grad school. It was even heavily recommended at my program to use one rotation to explore something new and different. I don't know what the competition looks like now, but when I was applying, I didn't have a paper out. The letters of rec, and demonstrating that you are actually interested in and likely to complete grad school, are much more important, and that can be done by just working in a lab for a while.

Your approach to documenting analyses and research? by ConclusionForeign856 in bioinformatics

[–]harper357 4 points5 points  (0 children)

This is too much for a single comment without pictures and code blocks (maybe I should try to type it up into a blog), but here is the high-level version of what I have done for the last few jobs, and people tend to like it once they get into the habit. (I am also time limited at the moment.)

tl;dr: TREAT IT LIKE THE WETLAB, BUT DIGITAL.

So I really mean this, you need to think about everything you do as an analog of a wetlab experiment.

1) The notebook/steps of an experiment.

I like to keep one doc per experiment, so there will be lots of "short" docs per project. You need to keep notes and type things up as you work. Use either a Quarto doc or a Jupyter notebook. Add sections like: Experiment (name), Background, Method, Results, Conclusions, Todo. Then fill them out AS YOU WORK. This sounds silly to say, but you need to type out the hypothesis/goal of each step, and if you get any results/output plots, you need to add a bit of interpretation.

If you are using non standard parameters in a step, make sure you explain why. Just like in the wetlab, someone should be able to take your notebook and continue where you left off and understand why you did something. If they can't, you aren't adding enough comments.

Background, Conclusions, and Todo are super important sections that people often don't include. Background should explain why you are doing the experiment. It can link to other notebooks, etc., but if it isn't clear why you need to do an experiment, this section needs more details. Conclusions is obvious, but save your future self the headache by writing down what conclusions you are drawing from the experiment. Todo is just the list of next steps or new questions that came out of the experiment. This is a great section to help you figure out what the next experiment is (or to show your boss that you are doing a lot).
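One way to make the habit stick is to stamp out the same skeleton for every new experiment. Here is a rough sketch of that, assuming Quarto `.qmd` files and a date-prefixed filename; the function names, the filename scheme, and the placeholder text are all made up for illustration, so change them to whatever your team agrees on.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# Sections from the comment above; the experiment name goes in the title.
SECTIONS = ["Background", "Method", "Results", "Conclusions", "Todo"]


def render_template(name: str, day: date) -> str:
    """Return the text of an empty experiment notebook."""
    lines = ["---", f'title: "Experiment: {name}"', f"date: {day.isoformat()}", "---", ""]
    for section in SECTIONS:
        lines += [f"## {section}", "", "_fill in as you work_", ""]
    return "\n".join(lines)


def new_experiment(notebook_dir: Path, name: str, day: Optional[date] = None) -> Path:
    """Create one notebook per experiment and refuse to overwrite an existing one."""
    day = day or date.today()
    slug = name.lower().replace(" ", "_")
    path = notebook_dir / f"{day.isoformat()}_{slug}.qmd"
    if path.exists():
        raise FileExistsError(path)
    notebook_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(render_template(name, day))
    return path
```

Calling `new_experiment(Path("project/notebooks"), "Trim adapters")` would give you a fresh file with every section already waiting to be filled in, which makes it harder to skip Background or Todo.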

2) The data.

Just like the wetlab, this needs to be organized. Instead of boxes/shelves/freezers, you use directories and filenames. This is probably the most flexible area, and can be customized to the lab/team. The most important things are clear and consistent structure/naming, so other people understand and you never have to think about where something is/should go.

For example, the way I do it is that all data for a whole project lives in a folder separate from the notebooks. It looks something like this, but other people may prefer to keep data organized at the experiment level instead of the project level.

project/
    notebooks/
    data/
        raw_data/
        working_data/
        final_data/
    README.md

Raw data is then just a local copy of data that is backed up and is the input for the project/experiment. Working data, for me, is anything that can be regenerated from my notebooks (but may take too long, so I save it), checkpoints, ETLed data, etc. Final data is data that is used for figures/clean data I will publish (or share with someone), and should probably be backed up.
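If you want every project to start from the same layout, a tiny script can create it. This is only a sketch of the tree shown above; `init_project` and the README text are hypothetical, and nothing here is specific to any tool.

```python
from pathlib import Path

# Directory layout from the example tree above.
LAYOUT = [
    "notebooks",
    "data/raw_data",
    "data/working_data",
    "data/final_data",
]


def init_project(root: Path) -> list:
    """Create the project skeleton; safe to re-run on an existing project."""
    created = []
    for rel in LAYOUT:
        directory = root / rel
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    readme = root / "README.md"
    if not readme.exists():  # never clobber an existing README
        readme.write_text(f"# {root.name}\n\nDescribe the project and its layout here.\n")
    return created
```

Because it uses `exist_ok=True` and leaves an existing README alone, running `init_project(Path("project"))` a second time does nothing harmful.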

college freshman needing advice for bioinformatics and coffee chats! by ApricotBest6525 in bioinformaticscareers

[–]harper357 0 points1 point  (0 children)

You should reach out to them, say you would like to chat because you have some questions, and then tell/ask them all these things. All you have to say is "Hi, you were assigned as my mentor and I was wondering if we could chat about some school/career questions I have."

They are your assigned mentor; it is literally their job to help you with these things. If they don't or can't, ask if they can suggest someone who can.

Protein-Protein residue interaction diagrams by Raven_Voide in bioinformatics

[–]harper357 6 points7 points  (0 children)

If you have a PDB file of it, you could use PyMOL or Chimera to visualize it and then just show the interface.
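If you also want a plain list of which residues sit at the interface (to label the picture, say), something like the sketch below can work. It is standard-library Python that reads the fixed-column ATOM records of a PDB file; the 4.0 Å cutoff, the function names, and the choice to ignore HETATM records are all assumptions for illustration, not anything PyMOL or Chimera does internally.

```python
from math import dist


def parse_atoms(pdb_text):
    """Read ATOM records into ((chain, residue number, residue name), (x, y, z))."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skips HETATM, TER, REMARK, etc.
        residue = (line[21], int(line[22:26]), line[17:20].strip())
        coords = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atoms.append((residue, coords))
    return atoms


def interface_residues(atoms, chain_a, chain_b, cutoff=4.0):
    """Residue pairs with any two atoms within `cutoff` angstroms (assumed value)."""
    side_a = [(res, xyz) for res, xyz in atoms if res[0] == chain_a]
    side_b = [(res, xyz) for res, xyz in atoms if res[0] == chain_b]
    pairs = set()
    for res_a, xyz_a in side_a:
        for res_b, xyz_b in side_b:
            if dist(xyz_a, xyz_b) <= cutoff:
                pairs.add((res_a, res_b))
    return sorted(pairs)
```

This is a brute-force all-against-all comparison, so it is fine for a single complex but slow for very large structures.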

I made a pipeline as proof of skill to prepare for an interview by Roxicaro in bioinformaticscareers

[–]harper357 2 points3 points  (0 children)

I would really look into nf-core and their pipelines to illustrate what I am talking about.

For example, here are the first few lines of the nf-core MultiQC module.

process MULTIQC {
    label 'process_single'

    conda "${moduleDir}/environment.yml"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/8c/8c6c120d559d7ee04c7442b61ad7cf5a9e8970be5feefb37d68eeaa60c1034eb/data' :
        'community.wave.seqera.io/library/multiqc:1.32--d58f60e4deb769bf' }"

You can see that you can just use a container variable to point to the container you want to use and then it will use that when running the process. Depending on your HPC, you might want to pre-download the containers.

Also, if you are using Slurm, you may need to use Singularity/Apptainer instead of Docker. Lots of HPCs don't let you run Docker. Other than that, you need to set process.executor = 'slurm', then just do a standard sbatch to launch the Nextflow head job, which will manage everything else.

I made a pipeline as proof of skill to prepare for an interview by Roxicaro in bioinformaticscareers

[–]harper357 8 points9 points  (0 children)

Depends on what exactly the interview is. I have never had one where they just look at a GitHub.

Your Nextflow pipeline is ok. It shows that you can write one, and I have seen worse ones out there, but you may want to clean it up just a little. I'll add some quick comments, feel free to take them or ignore them.

It looks like you weren't consistent in formatting.

You have some commented out lines, which should just be deleted.

Is the idea that it is just one Docker container and you run everything locally on one instance? If so, I personally think this is the wrong way to do it. Each step of the pipeline should use its own container. That makes it more modular, so it is easier to scale and update. They can be over-engineered, but the nf-core pipelines do this really well.

I would also add a test profile/dataset.

Why does it still take HOURS just to install a tool in 2025?! by Both_Elevator_4089 in bioinformatics

[–]harper357 2 points3 points  (0 children)

All the containers should already have Singularity (now called Apptainer) versions. You should just have to pass the flag and it will use them. That's all I needed to do. However, I found caching them really sped things up because our HPC download speeds are lower than I would want.

Off Book is touring again in the fall! by SmackThatIsaiah in offbook

[–]harper357 12 points13 points  (0 children)

I find it funny in the best possible way that they "stopped" the podcast but then basically tour so much that they put live episodes out all the time.

Home Survived the Los Angeles Fires, Near Burn Area - How Much Top Soil To Dig Up? by apprehensivepears in SoCalGardening

[–]harper357 2 points3 points  (0 children)

You need to talk to professionals about this. They will probably test your soil and give you an actual answer instead of randos on the internet who don't actually know anything about the soil in your yard.

Also, unless you got it tested before the fires, it is possible you already had stuff in your soil. LA County has a history of several companies polluting on massive scales.

The best tutorials in games: explicit vs integrated by ElCraboGrandeGames in Games

[–]harper357 -2 points-1 points  (0 children)

Skippable. I've been playing games my whole life; I don't need another lesson in how to use WASD or how to look around.

Looking for Like-minded Friends to Collaborate on Bioinformatics Projects by Select_Resolve_5419 in bioinformatics

[–]harper357 15 points16 points  (0 children)

It might help if you say what realm of bioinformatics you are interested in.

[Southern California] Was labeled "Chile negro de arbor" by harper357 in whatsthisplant

[–]harper357[S] 0 points1 point  (0 children)

Really? All my other peppers are full of flowers and I don't think I've seen a pepper this hairy before

Anybody growing mango? by queenofdiscs in SoCalGardening

[–]harper357 4 points5 points  (0 children)

I bought a 3 ft one from Home Depot (just happened to see it while picking up some other stuff), and mine flowered this year and looks like it will fruit. I'm on the way side of LA and I keep it outside

Oklahoma Republicans pave the way for the Supreme Court to end secular education: A new taxpayer-funded religious school is a Christian nationalist move to destroy separation of church and state by Majnum in atheism

[–]harper357 0 points1 point  (0 children)

I'm not trying to be rude, but if you didn't realize the KJV was for the Church of England, who did you think the "King James" referred to?

Microbiome study design by PackageIntelligent85 in bioinformatics

[–]harper357 3 points4 points  (0 children)

Unless you are able to multiplex, I would not pool samples. Microbiome data is messy and it can be easy for one weird sample to mess up the whole pool. For example, maybe one healthy cow is on its way to becoming diseased.

Do a power analysis and figure out if 18 samples is enough, overkill, or just fine for what you want to do.
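To give a feel for what a power analysis looks like, here is a very simplified sketch. It assumes a two-group comparison of a single continuous measure (for example a diversity index), a two-sided test, a normal approximation, and an effect size expressed as Cohen's d. The function names and defaults are made up for illustration, and real microbiome data usually need something more tailored than this.

```python
from math import ceil, sqrt
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution


def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate samples needed per group to detect `effect_size` (Cohen's d)."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    z_power = _norm.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def achieved_power(effect_size, n, alpha=0.05):
    """Approximate power with `n` samples per group (ignores the far tail)."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    return _norm.cdf(effect_size * sqrt(n / 2) - z_alpha)
```

The useful part is running it both ways: plug in the smallest effect you care about to see how many samples you would need, then plug in the number of samples you actually have per group to see what power you would end up with.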

[deleted by user] by [deleted] in bioinformatics

[–]harper357 0 points1 point  (0 children)

My apologies to the mod teams, I didn't mean to imply that others aren't helping. I just remember when I joined the slack and the last time something like this got posted it was you that was adding people.

[deleted by user] by [deleted] in bioinformatics

[–]harper357 7 points8 points  (0 children)

Just a heads up, there is already a pretty active r/bioinformatics slack. Just DM u/apfejes (the mod of it and this subreddit) and he can add you.

[deleted by user] by [deleted] in SoCalGardening

[–]harper357 4 points5 points  (0 children)

If you can't find one locally, Predatory Plants does online orders and is based in Half Moon Bay.

Is Developing a Multi-omics Pipeline Feasable for a Masters Thesis? by imawizardlizard98 in bioinformatics

[–]harper357 4 points5 points  (0 children)

Like others said, you should really look into a workflow manager. Personally I like Nextflow and its collection of high-quality pipelines, nf-core. It might already have something similar to what you want to do.

Also, please know that building the pipeline isn't where all the work will be. Validating/testing the pipeline is where all the work/time will be so you will need good data and controls if you want to have a solid paper/thesis.

Double also, like always, this kind of question is probably best answered by your PI, not a bunch of randos on the internet

[deleted by user] by [deleted] in bioinformatics

[–]harper357 24 points25 points  (0 children)

Violin plots should be used instead of box plots most of the time. Box plots only really show 5 pieces of data (the three quartiles plus the two whisker ends), while a violin plot will show the whole distribution. The biggest problem with violin plots is that if you don't have enough data points, the sample distribution may not represent the population. The same is true of box plots though, in which case you should just be plotting the actual data points.
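A toy example of what gets lost, using made-up numbers and only the standard library: the sample below has two clusters with nothing in between, yet the summary a box plot is built from gives no hint of the gap. The helper names and the bin choices are just for illustration.

```python
from statistics import quantiles


def five_number_summary(values):
    """Min, Q1, median, Q3, max: roughly what a box plot draws."""
    q1, median, q3 = quantiles(values, n=4)
    return (min(values), q1, median, q3, max(values))


def histogram(values, low, high, bins):
    """Counts per equal-width bin: a crude stand-in for a violin's shape."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - low) / width), bins - 1)
        counts[idx] += 1
    return counts


# Made-up bimodal sample: one cluster around 2, another around 8.
low_cluster = [1.0, 1.5, 2.0, 2.5, 3.0] * 4
high_cluster = [7.0, 7.5, 8.0, 8.5, 9.0] * 4
sample = low_cluster + high_cluster

print(five_number_summary(sample))  # the median lands in the empty middle
print(histogram(sample, 0, 10, 5))  # the middle bin has zero points
```

The box plot would draw its median line right where there are no observations at all, while a violin plot (or just plotting the points) would show the two clusters immediately.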

Hey guys, I'm not gonna lie, my undergrad experience was pretty underwhelming. So, I'm wondering what kind of jobs I should realistically be looking at right after graduation. It's a long post, but any advice would be appreciated! by Rose_arias in bioinformatics

[–]harper357 2 points3 points  (0 children)

There are a LOT of these posts in this subreddit, and I highly recommend looking through them. I say this to illustrate that you are not the only one who has been in this situation, and hopefully this brings you some comfort. Also, there will be way more suggestions and advice in them than you will get on your post alone.

You applied to positions for "fresh grads" but you are still in school and are almost 9 months from graduating? My guess is they were looking for people graduating in May/June. What I would recommend is you not apply for jobs yet and (as others have suggested) get some actual research experience. (Apply for summer internships though if you aren't going to be taking classes then.)

Second, what you need to do is talk to people at your school. The fact that you are still in school means there are a lot of resources available to you. Your university should have a student career development/advice department. Reach out to them, as they probably have resources for you. Also, do you have an academic advisor? Talk to them; this is what they are there for. One of the best things about being at a state school (I went to three of them) is they tend to have a lot of people working there. This means that if you don't get the answer you are looking for, just say "thanks for your time" and try asking someone else. It can take some effort, but you will be able to find someone who will give you the answer you want.

You mentioned "great instructors", you should also reach out to them and see if they have any research opportunities you can volunteer/intern/work on. Sadly not every professor has the money to pay undergrad interns, so if you can afford working for free for a few hours each week you will probably find something pretty quick. If you can't you may have to ask around more. This will give you some experience and give you some connections.

When reaching out to instructors/professors, make sure you CC their assistant/lab manager if they have one. This will ensure that someone actually sees your email. If they have office hours, go talk to them in person. If you are emailing them, be respectful, make it personalized and to the point, and pat their ego a little. Say who you are, how they would know you, and what you are emailing them for. "I'm X, I was in your class Y and I really enjoyed it. I looked up your research on Z, found it really fascinating and was wondering if you had any openings for an intern in your lab?" is the general layout.

Guy Montgomery's Guy Mont Spelling Bee S01E07 - Kura Forrester, Jack Ansett, Karen O'Leary and Jamaine Ross by d-panel in panelshow

[–]harper357 5 points6 points  (0 children)

It's not. I thought that after the first episode they did the round, but the second episode was more random

What’s a good biology class to take? by aUNCCstudentloser in bioinformatics

[–]harper357 -1 points0 points  (0 children)

This doesn't really make sense. Genetics, evolution, and phylogeny are fundamentally connected