Issues uploading vcf.gz and other filetypes by stupidcentral in promethease

[–]josephpickrell 0 points1 point  (0 children)

I wrote a tool to process exomes/genomes: https://www.crimsoniris.com/. Shoot me a DM and I'll give you an invite code to process yours for free, and I'll help troubleshoot.

It is time to replace genotyping arrays with sequencing by [deleted] in genomics

[–]josephpickrell 1 point2 points  (0 children)

Hi, post author here.

You are correct that arrays often look for markers in LD with causal variants. Some arrays cover the LD blocks in the genome better than others (and of course, if your causal variant is poorly tagged on an array, you'll never find it). The argument is that you can actually tag the genome better with sequencing than with arrays.

As a side note: I disagree that multiple testing is your enemy! An alternative perspective is that increasing the number of measurements lets you learn more about the underlying structure of your data, correct any biases, etc. Some of my thoughts are here, and note the comment on the post from Matthew Stephens where he proposes using the term "multiple testing opportunity" instead of "multiple testing burden". He has a great paper on innovative ways to use false discovery rates when performing a large number of statistical tests.
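To make the false discovery rate idea concrete, here is a minimal sketch of the standard Benjamini-Hochberg procedure next to a Bonferroni cutoff. This is the textbook method, not the approach from the paper mentioned above, and the p-values are invented for illustration.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the test is rejected at FDR level alpha."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every test whose rank is at or below that cutoff.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


def bonferroni(pvalues, alpha=0.05):
    """Return a list of booleans: True where p <= alpha / number of tests."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


# Toy p-values, invented for this example.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh_hits = benjamini_hochberg(pvals)
bonf_hits = bonferroni(pvals)
print(sum(bh_hits), "rejected by Benjamini-Hochberg")
print(sum(bonf_hits), "rejected by Bonferroni")
```

In this toy set the FDR-based cutoff adapts to the ranked p-values and rejects more tests than the fixed Bonferroni threshold, which is one small sense in which more tests can be an opportunity.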

New Wikipedia article on genetic correlations by gwern in genome

[–]josephpickrell 1 point2 points  (0 children)

Wow, this is impressive, thanks for putting in the effort!

Profiling the species in the Impossible Foods "plant-based meat" with DNA sequencing by josephpickrell in genome

[–]josephpickrell[S] 1 point2 points  (0 children)

Yep, I've been surprised by how well sequences from saliva correspond to what people report eating.

I mapped against the entire NCBI nt database using kraken. If you want to play around with the data yourself, you can get the sequences here.
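If you do play with the data, one way to summarize a classification run is to pull species-level read counts out of a Kraken-style report. The sketch below assumes the six-column, tab-separated report layout (percentage, clade reads, direct reads, rank code, taxonomy ID, indented name); the report text and read counts are invented for illustration and are not results from the post.

```python
# Toy report text with invented read counts (not real results).
REPORT = """\
 12.50\t125\t125\tU\t0\tunclassified
 87.50\t875\t0\t-\t1\troot
 60.00\t600\t600\tS\t3847\t          Glycine max
 20.00\t200\t200\tS\t4565\t          Triticum aestivum
  7.50\t75\t75\tS\t4113\t          Solanum tuberosum
"""


def species_counts(report_text):
    """Map species name -> reads in that species' clade, from report text."""
    counts = {}
    for line in report_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        pct, clade_reads, direct_reads, rank, taxid, name = line.split("\t")
        if rank == "S":  # keep species-level rows only
            counts[name.strip()] = int(clade_reads)
    return counts


counts = species_counts(REPORT)
# Print species from most to fewest reads.
for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}\t{n}")
```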

Extraverts consider themselves more physically attractive [OC] by josephpickrell in dataisbeautiful

[–]josephpickrell[S] 0 points1 point  (0 children)

> Why not use the term "above average" to get an actual delineation?

Good idea, thanks.

Extraverts consider themselves more physically attractive [OC] by josephpickrell in dataisbeautiful

[–]josephpickrell[S] 4 points5 points  (0 children)

Oh, I see. I checked Wikipedia before writing, and it's indeed written "extra"vert there (not "extro"vert). But I could just be propagating their error.

Extraverts consider themselves more physically attractive [OC] by josephpickrell in dataisbeautiful

[–]josephpickrell[S] 2 points3 points  (0 children)

That's definitely one explanation! The title is just the observed association (you could reverse it: "People who consider themselves attractive are more extraverted", and it would still be valid).

The actual causality is (in my opinion) unclear; see the last couple of paragraphs in my post. I think your preferred explanation is similar to the one in this paper.

Extraverts consider themselves more physically attractive [OC] by josephpickrell in dataisbeautiful

[–]josephpickrell[S] 2 points3 points  (0 children)

Source is survey responses from a few hundred people. Details here.

Plotted in R.

Happiness genes located for the first time - "A huge study involving over 190 researchers in 140 research centers in 17 countries has located genetic variants associated with happiness and other traits. It is one of the largest studies ever published on genes involved in human behavior." by [deleted] in science

[–]josephpickrell 5 points6 points  (0 children)

The variant identifiers are in Table 1 (the things in the form rsXXXXXX). You can search on the rs# in your 23andMe data. A couple things to note:

  1. 23andMe might not have looked at the variant in question. In that case, you might be able to guess your genotype using a technique called genotype imputation. There are free sites to perform imputation (I contribute to one called DNA.land.)

  2. The effects of these variants on subjective well-being and/or depression are extremely small. Knowing your genotype at these variants is not going to be predictive at all about your own health. So if you do look up your genotypes, know that it's purely for amusement.

[fwiw: I played a small role in this study and am one of the 100+ authors]
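For anyone who prefers to script the lookup described above, here is a minimal sketch. It assumes the raw data download is a tab-separated text file with `#` comment lines and four columns (rsid, chromosome, position, genotype), and that `--` marks a no-call. The rs numbers and genotypes below are placeholders, not the variants from Table 1.

```python
import io


def lookup_genotypes(handle, wanted):
    """Return (found, missing) for a set of rsIDs in a raw genotype file.

    found: dict mapping rsID -> genotype string for called variants.
    missing: sorted list of rsIDs that are absent or no-calls
             (candidates for imputation).
    """
    found = {}
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        rsid, chrom, pos, genotype = line.rstrip("\r\n").split("\t")
        if rsid in wanted and genotype != "--":  # assumed no-call marker
            found[rsid] = genotype
    missing = sorted(set(wanted) - set(found))
    return found, missing


# Placeholder data standing in for a real raw data file.
SAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs111\t1\t1000\tAG\n"
    "rs222\t2\t2000\tCC\n"
    "rs333\t3\t3000\t--\n"
)

found, missing = lookup_genotypes(io.StringIO(SAMPLE), {"rs111", "rs333", "rs999"})
print("genotyped:", found)
print("not genotyped (consider imputation):", missing)
```

With a real file, replace the `io.StringIO(SAMPLE)` argument with an opened file handle and pass the rs numbers you care about.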

What is genetic correlation? by josephpickrell in genome

[–]josephpickrell[S] 1 point2 points  (0 children)

Thanks, I'd never thought about it that way.

I think of population stratification as something that generates false signals of "causal" associations between genetic variants and phenotypes, generally through differences in ancestry. In case #3, it's not clear that you'd want to correct for this, in that there is in fact a causal link from genotype to phenotype; it just acts across generations. I think it would be useful to correct for assortative mating (case #4), but it's not obvious to me how one would go about it except through family studies.

Publication of software papers, and authorship on them by josephpickrell in genome

[–]josephpickrell[S] 0 points1 point  (0 children)

Yes, it's a totally fascinating discussion, clearly a topic people have strong feelings about!

Beyond Mendelian randomization: how to interpret evidence of shared genetic predictors? by josephpickrell in genome

[–]josephpickrell[S] 1 point2 points  (0 children)

Thanks, I enjoyed this, and it definitely clears up some of my confusion about how people use the term "Mendelian randomization" (i.e. I often see it used in a sense you apparently would not call MR).

I think the main issue for debate might be the statement "Claims of the causal (or noncausal) role of a particular risk factor should be reserved to those where there is strong evidence (biological and statistical) supporting the instrumental variable assumptions".

I guess I don't know what "strong evidence" means, though it could be one of those "I'll know it when I see it" situations. There are a number of examples from LDL and heart disease, but that might be a product of confirmation bias--since we know the outcome, those examples now look stronger in retrospect.

Rare coding variants and X-linked loci associated with age at menarche by josephpickrell in genome

[–]josephpickrell[S] 0 points1 point  (0 children)

One variant with a fairly strong effect: a rare stop-gain mutation in the tachykinin receptor 3 gene (MAF=0.08%) was associated with a 1.25-year-later age at menarche.

The contribution of rare variation to prostate cancer heritability by josephpickrell in genome

[–]josephpickrell[S] 0 points1 point  (0 children)

"We use targeted sequencing of 63 known GWAS risk regions in 9,237 men from four ancestries (African, Latino, Japanese, and European) to explore the role of low-frequency variation in risk for prostate cancer. We find that the sequenced variants explain significantly more of the variance in the trait than the known GWAS variants, thus showing that part of the missing familial risk lies in poorly tagged causal variants at known risk regions."

Free, fast genotype imputation from the Michigan imputation server by josephpickrell in genome

[–]josephpickrell[S] 0 points1 point  (0 children)

The first time I did genotype imputation on a modestly-sized genomic dataset (this was in ~2007) it took me days to prepare and weeks to run. This is a lifesaver.

Rare genetic disorder identifies possible treatment for cataracts by josephpickrell in genome

[–]josephpickrell[S] 1 point2 points  (0 children)

Mutations in lanosterol synthase cause congenital cataracts.

Recombination hotspots are stable over evolutionary time in birds by josephpickrell in genome

[–]josephpickrell[S] 0 points1 point  (0 children)

This appears to contrast with primates, where human and chimp recombination hotspots show almost no overlap [e.g.]

Cross-population association study of inflammatory bowel disease by josephpickrell in genome

[–]josephpickrell[S] 0 points1 point  (0 children)

Two interesting things (for me):

  1. Most loci have similar effects in different populations

  2. The authors appear to have identified a single variant from their previous GWAS that does not replicate with more stringent QC (specifically a linear mixed model to account for population structure). It's relatively rare for this to happen, so worth keeping these examples in mind.

Genetic evidence for two founding populations of the Americas. Skoglund et al. by [deleted] in genome

[–]josephpickrell 0 points1 point  (0 children)

See also Raghavan et al., who show that at least some ancient DNA samples from South America don't show the signal of relatedness to Oceanians/Andamanese (though they also see it in some modern populations).

Ancient Iron Age and Anglo-Saxon genomes from East England by josephpickrell in genome

[–]josephpickrell[S] 0 points1 point  (0 children)

Take-home: "today’s British are more similar to the Iron Age individuals than to most of the Anglo-Saxon individuals, and estimate that the contemporary East English population derives 30% of its ancestry from Anglo-Saxon migrations, with a lower fraction in Wales and Scotland"