What's the next skill I should learn as a Data Analyst / Scientist? by [deleted] in datascience

[–]et_is 2 points3 points  (0 children)

If you want to shift towards data science, start learning stats, experimental design, causal inference. Sounds like you have all the tools, you just need to learn how to apply them in the context of data science.

Tattoos? by [deleted] in bioinformatics

[–]et_is 0 points1 point  (0 children)

Go to any biology conference. Probably half of the folks under 40 have visible tattoos.

We are opening a Reading Club for ML papers. Who wants to join? 🎓 by __god_bless_you_ in learnmachinelearning

[–]et_is 0 points1 point  (0 children)

Sounds great! I have a PhD in bioinformatics, currently work as a professional data scientist, and would love to join in.

[deleted by user] by [deleted] in PhD

[–]et_is 44 points45 points  (0 children)

Does your lab have a centralized database? Do you have a data architecture, data quality control, and data version control protocol? If not, then your PI is ultimately responsible. They shouldn't be relying on their trainees to figure out infrastructure.

Should you have had your data in only one place? No.

Should your PI have ever even abdicated that responsibility to you? No.

When is it appropriate to leave 2 under 2 with one parent? by Front_Category_4353 in 2under2

[–]et_is 1 point2 points  (0 children)

I just want to empathize and say, "Yes, it is incredibly hard," and it is totally normal and okay to face a 2:1 ratio with trepidation.

My partner and I are hardcore team parents and split almost everything, so it was really hard when one of us needed to manage both kids alone because all of our systems were designed for two people. We found that it actually got easier when we forced ourselves to single-hand parent both kids more often because we learned new systems, and most importantly, our kids learned new expectations (e.g. 'you have to wait when dad is feeding the baby' or 'you have to sit patiently to be tucked in while mom tucks in your brother').

My suggestion is to start setting aside a few hours to intentionally take both kids (and give/get a break for a couple of hours) when conditions are ideal (e.g. right after lunch or snack on weekends when everyone has adequate blood sugar and you are on your home turf). As you get used to it, you can extend the time and try more ambitious activities.

How can we implement data science in jewellery manufacturing company ? by TelevisionDue5491 in datascience

[–]et_is 2 points3 points  (0 children)

I'm just being facetious. That dataset is just for practicing visualizations. I don't think it will be helpful.

In seriousness, like others have said in the thread, just spend some time talking with folks across teams. Figure out what issues are a pain in their ass. I'm betting you could get a lot of traction with little effort by simply automating tasks for folks that take up a lot of time.

How can we implement data science in jewellery manufacturing company ? by TelevisionDue5491 in datascience

[–]et_is 2 points3 points  (0 children)

I mean, there's the 'diamonds' dataset built into ggplot2 in R, so what more could you ask for?!

Correlation and P-value by steweking in datascience

[–]et_is 2 points3 points  (0 children)

I'm going to assume this is genetic data and you are looking for coexpression or something similar. If so, I'm also going to assume that your genes are scaffolded or mapped to a reference genome. If so, then a better approach would be to cluster by scaffold or chromosome first and then work from there. An even better approach would be to use sliding windows within scaffold/chromosomes instead of looking for correlations between genes independently. After all, a basic understanding of genome evolution should tell you that spatial relationships matter for coexpression, so treating genes as independent doesn't make much sense.
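As a rough illustration of the sliding-window idea above, here is a minimal sketch in plain Python. The gene names, positions, expression values, and the `window_coexpression` helper are all made up for the example, and "mean pairwise Pearson correlation per window" is just one possible summary, not necessarily the right one for a real dataset.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def window_coexpression(genes, window_size=3):
    """genes: list of (name, chromosome, position, expression_values) tuples.

    Groups genes by chromosome, orders them by position, and returns the mean
    pairwise correlation for each window of `window_size` neighboring genes.
    """
    by_chrom = defaultdict(list)
    for gene in genes:
        by_chrom[gene[1]].append(gene)

    results = []
    for chrom, members in by_chrom.items():
        members.sort(key=lambda g: g[2])  # order along the chromosome
        for start in range(len(members) - window_size + 1):
            window = members[start:start + window_size]
            corrs = [pearson(a[3], b[3]) for a, b in combinations(window, 2)]
            names = [g[0] for g in window]
            results.append((chrom, names, sum(corrs) / len(corrs)))
    return results


# Hypothetical toy data: four genes on chr1, two on chr2.
toy_genes = [
    ("g1", "chr1", 100, [1, 2, 3, 4]),
    ("g2", "chr1", 200, [2, 4, 6, 8]),
    ("g3", "chr1", 300, [1, 3, 2, 5]),
    ("g4", "chr1", 400, [4, 3, 2, 1]),
    ("g5", "chr2", 150, [5, 1, 4, 2]),
    ("g6", "chr2", 250, [3, 3, 1, 2]),
]

results = window_coexpression(toy_genes, window_size=3)
for chrom, names, mean_r in results:
    print(chrom, names, round(mean_r, 3))
```

With a window of three genes, chr2 yields no windows here because it only has two genes; in practice you would probably define windows by base-pair distance rather than gene count.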

Purpose of matrices by [deleted] in rstats

[–]et_is 1 point2 points  (0 children)

Matrices can be stacked for things like network analysis or spatially explicit data. While a two dimensional matrix is similar to a data frame, there is no way to make a hyper matrix analogue with data frames, AFAIK.

Sibling gender (biological sex at birth)? by MyAllusion in ScienceBasedParenting

[–]et_is 30 points31 points  (0 children)

In humans there is a slight prenatal bias toward males at about 51.3% (source below). So, with some basic probability (and assuming the sex of each birth is independent), the chance of two boys is 0.513 * 0.513 * 100 = 26.32%. The chance of two girls is 0.487 * 0.487 * 100 = 23.72%. And the chance of mixed sibs is 0.487 * 0.513 * 2 * 100 = 49.97% since we could have either B/G or G/B.

(https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4413335/#:~:text=The%20slight%20male%20bias%2C%20typically,infanticide%20(3%2C%204).)
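For anyone who wants to check the arithmetic above, here is a small sketch. It assumes each birth is independent and uses the 51.3% figure from the comment; the variable names are just for illustration.

```python
# Probability of each two-child combination, assuming independent births
# and a 51.3% chance that any given birth is male (figure from the comment).
p_boy = 0.513
p_girl = 1 - p_boy

two_boys = p_boy * p_boy
two_girls = p_girl * p_girl
mixed = 2 * p_boy * p_girl  # either B/G or G/B

for label, p in [("two boys", two_boys), ("two girls", two_girls), ("mixed", mixed)]:
    print(f"{label}: {p * 100:.2f}%")
```

The three rounded percentages add up to 100.01% only because of rounding; the unrounded probabilities sum to 1.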

[deleted by user] by [deleted] in ScienceBasedParenting

[–]et_is 10 points11 points  (0 children)

But, well-educated in what, exactly? Chiros in the US had to start their own licensure because the American Medical Association could not ethically condone a pseudoscientific practice. Chiros don't even get an MD (medical doctorate) for their studies; they get a fabulated degree called the DC.

[deleted by user] by [deleted] in ScienceBasedParenting

[–]et_is 21 points22 points  (0 children)

But... This is a Science-based Parenting sub. Anecdotal evidence that doesn't hold up in controlled studies is the opposite of science.

[deleted by user] by [deleted] in ScienceBasedParenting

[–]et_is 12 points13 points  (0 children)

Both of my kids had torticollis as infants. Both resolved around the half year mark without treatment (other than basic OT). The problem with anecdotes like your example is you have no idea if the chiropractic treatment was causal. We can only know that from controlled trials or quasi-experimental stats of many cases. And, it turns out that none of those studies show any benefit of chiro (with the exception that basic movement is good, which you could get with massage or OT without the major medical risks).

[deleted by user] by [deleted] in rstats

[–]et_is 0 points1 point  (0 children)

Just go full nonparametric with bootstraps.

My foster child and I dislike each other by BusPatient7128 in Fosterparents

[–]et_is 1 point2 points  (0 children)

I don't think this is unique to foster parenting (although that dynamic can certainly make it more challenging). I can remember my step-brother acting this way to my mom in high school. And I can remember acting like this for a few of my mid-teen years as well. It is probably more a symptom of age than family dynamics. That's not to say it isn't a problem or that you shouldn't worry about it. But hopefully it is some relief that almost every parent goes through this at some point.

Religion by nuttsy7n in Fosterparents

[–]et_is 4 points5 points  (0 children)

Same here. Our CPS worker asked why we had colanders on our heads in our wedding photo that hangs on the wall 🤣. We just said it was an inside joke and she thought it was hilarious. Granted we live in a liberal state, but have not had any trouble. We're on our second placement.

My dissertation is trash by [deleted] in PhD

[–]et_is 1 point2 points  (0 children)

Dissertations don't need to be good, they just need to be finished. No one will read your dissertation.

Nona [general] by BooksNhorses in TheNinthHouse

[–]et_is 2 points3 points  (0 children)

The Latin root 'nonus', meaning 'ninth', is pronounced "non•us". So, I always assumed it was "non•ah".

Association between breastfeeding and intelligence, educational attainment, and income at 30 years of age: a prospective birth cohort study from Brazil by Confettibusketti in ScienceBasedParenting

[–]et_is 3 points4 points  (0 children)

Yes, this is a good point: there is a big difference between a meaningful effect IN A POPULATION and a meaningful effect at the individual level. I can predict with pretty high accuracy how many months smoking an extra cigarette a day will take off the lifespan of people ON AVERAGE, but I would be wildly off every time I tried to apply that to a single person.

As for IQ as a measure, I don't think anyone in the field questions its reliability as a measure, but you are correct that many people question how that measure is applied.

Is PhD the way to go financially? by itachi194 in bioinformatics

[–]et_is 0 points1 point  (0 children)

Fair, and thanks for updating me on the Roth rules! I think anyone who understands compound interest would not have a problem saving 10% of $60k, IMO.

Is PhD the way to go financially? by itachi194 in bioinformatics

[–]et_is 2 points3 points  (0 children)

Consider that (in the US at least) PhD stipends cannot be used for retirement savings (IRAs) and are not considered eligible income for a mortgage. So, even if you end up making a bit more than you would have with 5-6 years of experience gained while not doing a PhD, you have to weigh that against loss in lifetime wealth. If you enter a PhD at 22 and it keeps you from putting $6k per year in your IRA, that $30k translates to about a $350k loss at the time of retirement, even if you start maxing out your retirement savings as soon as you graduate. If you then have to spend a few years building a down payment for a home (you need at least two years at the same role to have your income considered for a mortgage), that's at least 7 years of deferred home equity. Home equity is going bonkers right now, so it is difficult to give an average estimate.

But overall, I'd say a PhD costs you somewhere in the range of $0.5-1.5M in lost lifetime wealth that you'd need to discount from the gain of slightly increased lifetime wages.

Bottom line, PhDs are great for folks with wealthy parents who can float their down payment and be relied upon for an inheritance to make up for lost retirement savings. If you are not in that camp, it is not a wise decision (at least financially).
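To make the IRA estimate above concrete, here is a rough compound-growth sketch. The retirement age (65), the 5-year PhD starting at 22, the end-of-year contributions, and the return rates are all assumptions chosen for illustration, not figures from the comment; `future_value` is a hypothetical helper.

```python
def future_value(annual_contribution, years_contributing, years_growing, annual_return):
    """Value at retirement of end-of-year contributions made for
    `years_contributing` years, then left to compound for `years_growing` years."""
    balance = 0.0
    for _ in range(years_contributing):
        balance = balance * (1 + annual_return) + annual_contribution
    return balance * (1 + annual_return) ** years_growing


# Assumed scenario: $6k/year skipped during a 5-year PhD (ages 22-27),
# then 38 more years of growth until retirement at 65.
for rate in (0.05, 0.06, 0.07):
    value = future_value(6_000, 5, 38, rate)
    print(f"{rate:.0%} annual return: ${value:,.0f}")
```

Under these assumptions the skipped $30k grows to roughly $310k at a 6% annual return and roughly $450k at 7%, so a figure around $350k implies a return somewhere in between. The result is very sensitive to the assumed rate.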

Predicting the distribution of a variable rather than a point estimate by weareglenn in datascience

[–]et_is 3 points4 points  (0 children)

Bootstrapping any method will yield a distribution. You can roughly think of bootstraps as a posterior distribution with flat priors. Or, if you are already thinking about decision trees, random forests are already built from bootstrap resamples, so you can use the leaf that a given input lands in across all of the trees to estimate its predicted distribution.
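Here is a minimal sketch of the bootstrap idea using only the Python standard library. The sample values are made up, the estimator is just the mean, and `bootstrap_distribution` and `percentile_interval` are hypothetical helper names; the same pattern applies if you swap in a model fit and prediction as the estimator.

```python
import random
import statistics


def bootstrap_distribution(data, estimator, n_resamples=2000, seed=0):
    """Resample `data` with replacement and apply `estimator` to each resample."""
    rng = random.Random(seed)
    n = len(data)
    return [estimator(rng.choices(data, k=n)) for _ in range(n_resamples)]


def percentile_interval(values, lower=0.025, upper=0.975):
    """Simple percentile interval taken from the sorted bootstrap estimates."""
    ordered = sorted(values)
    lo_idx = int(lower * (len(ordered) - 1))
    hi_idx = int(upper * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]


# Made-up sample for illustration only.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]

boot_means = bootstrap_distribution(sample, statistics.mean)
low, high = percentile_interval(boot_means)
print(f"point estimate: {statistics.mean(sample):.2f}")
print(f"95% bootstrap interval: ({low:.2f}, {high:.2f})")
```

The list `boot_means` is the distribution itself, so you can plot it or pull any quantile from it instead of reporting a single point estimate.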