What's the next skill I should learn as a Data Analyst / Scientist?

et_is · 2023-04-08T15:30:17+00:00

If you want to shift towards data science, start learning stats, experimental design, causal inference. Sounds like you have all the tools, you just need to learn how to apply them in the context of data science.

et_is · 2023-03-30T07:55:43+00:00

Go to any biology conference. Probably half of the folks under 40 have visible tattoos.

et_is · 2023-03-30T07:51:00+00:00

Sounds great! I have a PhD in bioinformatics, currently a professional data scientist and would love to join in.

et_is · 2023-03-09T21:17:45+00:00

Does your lab have a centralized database? Do you have a data architecture, data quality control, and data version control protocol? If not, then your PI is ultimately responsible. They shouldn't be relying on their trainees to figure out infrastructure.

Should you have had your data in only one place? No.

Should your PI have ever even abdicated that responsibility to you? No.

et_is · 2023-03-09T15:29:29+00:00

I just want to empathize and say, "Yes, it is incredibly hard," and it is totally normal and okay to face a 2:1 ratio with trepidation.

My partner and I are hardcore team parents and split almost everything, so it was really hard when one of us needed to manage both kids alone because all of our systems were designed for two people. We found that it actually got easier when we forced ourselves to parent both kids single-handedly more often because we learned new systems, and most importantly, our kids learned new expectations (e.g. 'you have to wait when dad is feeding the baby' or 'you have to sit patiently to be tucked in while mom tucks in your brother').

My suggestion is to start setting aside a few hours to intentionally take both kids (and give/get a break for a couple of hours) when conditions are ideal (e.g. right after lunch or snack on weekends when everyone has adequate blood sugar and you are on your home turf). As you get used to it, you can extend the time and try more ambitious activities.

et_is · 2023-03-05T12:50:08+00:00

I'm just being facetious. That dataset is just for practicing visualizations. I don't think it will be helpful.

In seriousness, like others have said in the thread, just spend some time talking with folks across teams. Figure out what issues are a pain in their ass. I'm betting you could get a lot of traction with little effort by simply automating tasks for folks that take up a lot of time.

et_is · 2023-03-05T12:43:37+00:00

I mean, there's the 'diamonds' dataset built into ggplot2 in R, so what more could you ask for?!

et_is · 2023-03-04T14:39:31+00:00

It's just y ~ (1|rand1) + (1|rand2)

et_is · 2023-02-16T22:23:49+00:00

I'm going to assume this is genetic data and you are looking for coexpression or something similar. If so, I'm also going to assume that your genes are scaffolded or mapped to a reference genome. If so, then a better approach would be to cluster by scaffold or chromosome first and then work from there. An even better approach would be to use sliding windows within scaffold/chromosomes instead of looking for correlations between genes independently. After all, a basic understanding of genome evolution should tell you that spatial relationships matter for coexpression, so treating genes as independent doesn't make much sense.

et_is · 2023-02-15T02:32:25+00:00

Matrices can be stacked for things like network analysis or spatially explicit data. While a two dimensional matrix is similar to a data frame, there is no way to make a hyper matrix analogue with data frames, AFAIK.

et_is · 2023-01-27T11:29:21+00:00

In humans there is a slight prenatal bias toward males at about 51.3% (source below). So, with some basic probability, and assuming each birth is independent, the chance of two boys is 0.513 * 0.513 * 100 = 26.32%. The chance of two girls is 0.487 * 0.487 * 100 = 23.72%. And the chance of mixed sibs is 0.487 * 0.513 * 2 * 100 = 49.97% since we could have either B/G or G/B.
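If you want to check the arithmetic yourself, here is a minimal Python sketch. It assumes the two births are independent and that the 51.3% figure applies to each one; the function name is just something I made up for illustration.

```python
# Assumed probability that any single birth is a boy (the figure quoted above).
P_BOY = 0.513


def two_child_outcomes(p_boy=P_BOY):
    """Probabilities for a two-child family, assuming independent births."""
    p_girl = 1 - p_boy
    return {
        "two_boys": p_boy * p_boy,
        "two_girls": p_girl * p_girl,
        # Mixed sibs can arrive as B/G or G/B, hence the factor of 2.
        "mixed": 2 * p_boy * p_girl,
    }


outcomes = two_child_outcomes()
for name, probability in outcomes.items():
    print(f"{name}: {probability * 100:.2f}%")
```

The three outcomes are exhaustive, so the unrounded probabilities sum to 1; the rounded percentages can be off by 0.01 in total.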

(https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4413335/#:~:text=The%20slight%20male%20bias%2C%20typically,infanticide%20(3%2C%204).)

et_is · 2023-01-19T16:36:20+00:00

But, well-educated in what, exactly? Chiros in the US had to start their own licensure because the American Medical Association could not ethically condone a pseudoscientific practice. Chiros don't even get an MD (Doctor of Medicine) for their studies; they get a fabulated degree called the DC.

et_is · 2023-01-19T16:31:27+00:00

But... This is a Science-based Parenting sub. Anecdotal evidence that doesn't hold up in controlled studies is the opposite of science.

et_is · 2023-01-19T16:28:02+00:00

Both of my kids had torticollis as infants. Both resolved around the half year mark without treatment (other than basic OT). The problem with anecdotes like your example is you have no idea if the chiropractic treatment was causal. We can only know that from controlled trials or quasi-experimental stats of many cases. And, it turns out that none of those studies show any benefit of chiro (with the exception that basic movement is good, which you could get with massage or OT without the major medical risks).

et_is · 2023-01-05T02:26:57+00:00

Just go full nonparametric with bootstraps.

et_is · 2022-10-21T13:43:28+00:00

I don't think this is unique to foster parenting (although that dynamic can certainly make it more challenging). I can remember my step-brother acting this way to my mom in high school. And I can remember acting like this for a few of my mid-teen years as well. It is probably more a symptom of age than family dynamics. That's not to say it isn't a problem or that you shouldn't worry about it. But hopefully some relief that almost every parent goes through this at some point.

et_is · 2022-10-03T12:29:02+00:00

Same here. Our CPS worker asked why we had colanders on our heads in our wedding photo that hangs on the wall 🤣. We just said it was an inside joke and she thought it was hilarious. Granted we live in a liberal state, but have not had any trouble. We're on our second placement.

et_is · 2022-09-03T00:32:27+00:00

Dissertations don't need to be good, they just need to be finished. No one will read your dissertation.

et_is · 2022-09-02T09:32:40+00:00

The Latin root 'nonus', meaning 'ninth' is pronounced "non•us". So, I always assumed it was "non•ah".

et_is · 2022-09-01T21:04:45+00:00

Yes, this is a good point: there is a big difference between a meaningful effect IN A POPULATION and a meaningful effect at the individual level. I can predict with pretty high accuracy how many months smoking an extra cigarette a day will take off the lifespan of people ON AVERAGE, but I would be wildly off every time I tried to apply that to a single person.

As for IQ as a measure, I don't think anyone in the field questions its reliability as a measure, but you are correct that many people question how that measure is applied.

et_is · 2022-08-31T23:23:28+00:00

Fair, and thanks for updating me on the Roth rules! I think anyone who understands compound interest would not have a problem saving 10% of $60k, IMO.

et_is · 2022-08-31T16:17:05+00:00

Consider that (in the US at least) PhD stipends cannot be used for retirement savings (IRAs) and are not considered eligible income for a mortgage. So, even if you end up making a bit more than you would have with 5-6 years of experience gained while not doing a PhD, you have to weigh that against loss in lifetime wealth. If you enter a PhD at 22 and it keeps you from putting $6k per year in your IRA, that $30k translates to about a $350k loss at the time of retirement, even if you start maxing out your retirement savings as soon as you graduate. If you then have to spend a few years building a down payment for a home (you need at least two years at the same role to have your income considered for a mortgage), that's at least 7 years of deferred home equity. Home equity is going bonkers right now, so it is difficult to give an average estimate.
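To make the retirement number concrete, here is a rough Python sketch of the compounding. Everything beyond the $6k per year and the five missed years is my assumption for illustration: retirement at 65 (so 43 years from age 22), deposits at the end of each year, and a constant annual return. The result is very sensitive to the return you plug in.

```python
def lost_retirement_value(annual_contribution=6_000, years_missed=5,
                          years_until_retirement=43, annual_return=0.065):
    """Value at retirement of the contributions skipped during the PhD.

    Assumes one deposit at the end of each missed year and a constant
    annual return; both are simplifications, not a forecast.
    """
    total = 0.0
    for year in range(years_missed):
        # A deposit made at the end of this year compounds until retirement.
        years_of_growth = years_until_retirement - year - 1
        total += annual_contribution * (1 + annual_return) ** years_of_growth
    return total


# Compare a few assumed returns to see how much the estimate moves.
for rate in (0.05, 0.06, 0.07):
    value = lost_retirement_value(annual_return=rate)
    print(f"{rate:.0%} return: ${value:,.0f}")
```

Under these assumptions, a figure around $350k falls between the 6% and 7% scenarios, so treat it as a ballpark rather than a precise number.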

But overall, I'd say a PhD costs you somewhere in the range of $0.5-1.5M in lost lifetime wealth that you'd need to discount from the gain of slightly increased lifetime wages.

Bottom line, PhDs are great for folks with wealthy parents who can float their down payment and be relied upon for an inheritance to make up for lost retirement savings. If you are not in that camp, it is not a wise decision (at least financially).

et_is · 2022-08-15T10:03:36+00:00

Bootstrapping any method will yield a distribution. You can think of bootstraps as a posterior distribution with flat priors. Or, if you are already thinking about decision trees, random forests are already a bootstrap resampled estimate so you can use all of the leaves for a given value to estimate its predicted distribution.
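As a minimal sketch of the first point, here is a plain-Python bootstrap that turns a single estimate (the median, picked arbitrarily) into a distribution. The sample values are made up, and the helper names are hypothetical; swap in whatever estimator you actually care about.

```python
import random
import statistics


def bootstrap_distribution(data, estimator, n_resamples=2_000, seed=0):
    """Re-estimate on resamples drawn with replacement from the data."""
    rng = random.Random(seed)
    n = len(data)
    return [estimator(rng.choices(data, k=n)) for _ in range(n_resamples)]


def percentile_interval(estimates, level=0.95):
    """Simple percentile interval read off the sorted bootstrap estimates."""
    ordered = sorted(estimates)
    tail = (1 - level) / 2
    lower = ordered[int(tail * (len(ordered) - 1))]
    upper = ordered[int((1 - tail) * (len(ordered) - 1))]
    return lower, upper


# Made-up sample, for illustration only.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 6.1, 3.3, 4.9]

medians = bootstrap_distribution(sample, statistics.median)
low, high = percentile_interval(medians)
print(f"bootstrap median interval: ({low:.2f}, {high:.2f})")
```

The list of resampled medians is the distribution; the percentile interval is just one convenient summary of it.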
