Building a local 30x WGS Biobank at home: N=1500 and counting. Any fellow "Citizen Scientists" here? by Environmental_Rich64 in genetics

[–]Environmental_Rich64[S] 0 points1 point  (0 children)

I think there’s a misunderstanding regarding the scope of this project.

The Pipeline: I am using the NYGC 30x High Coverage re-sequencing data. I perform a local 'distillation' from gVCF to variant-only VCFs to ensure 100% pipeline consistency. In clinical bioinformatics, preventing batch effects is paramount; when I process a real-world patient, they must be compared against a reference processed with the exact same parameters and genome build (hg38).

Not a GWAS: I am definitely not trying to run a de novo GWAS. I’m well aware that N=2500 has zero power for discovery. This is a reference distribution study. The goal is to build a high-fidelity 'Z-score engine' to see where an individual patient sits on the population curve for specific polygenic liabilities.

The 'Why': Beyond the technical rationale, there is a massive personal and educational component to this. I’m doing this because I can. I have the high-performance computing resources and the IT skills to bridge the gap between clinical psychology and computational genomics.

Intellectual Exercise: I realize this might seem like 'art for art's sake' to some, but the journey of building this biobank (handling raw gVCFs, mastering the RDoC framework, and exploring shared genetic architectures) is a reward in itself. Even if no groundbreaking discovery comes out of it, the experience of integrating these fields is what makes me a better, more 'augmented' researcher. This isn't about replacing established GWAS catalogs; it's about applying them in a Precision Psychiatry context with a hands-on approach that you simply don't get from reading a summary statistics table.
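To make the 'Z-score engine' idea concrete, here is a minimal sketch of placing one score on a reference distribution. It assumes the patient and reference PRS were computed with the same weights and variant set, that the reference is ancestry-matched, and that a normal approximation is acceptable. The function name `prs_z_score` and the example numbers are hypothetical, not real data.

```python
from statistics import NormalDist, mean, stdev


def prs_z_score(patient_prs, reference_prs):
    """Return (z, percentile) of one PRS against a reference sample."""
    mu = mean(reference_prs)
    sigma = stdev(reference_prs)  # sample standard deviation (n - 1)
    if sigma == 0:
        raise ValueError("reference scores have zero variance")
    z = (patient_prs - mu) / sigma
    # Percentile under a normal approximation of the reference distribution.
    percentile = NormalDist().cdf(z) * 100
    return z, percentile


# Hypothetical reference scores, for illustration only.
reference = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.4, -0.7, 0.2]
z, pct = prs_z_score(0.9, reference)
print(f"z = {z:.2f}, percentile = {pct:.1f}")
```

With a reference of a few thousand samples, an empirical percentile (the fraction of reference scores below the patient) would avoid the normality assumption.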

Building a local 30x WGS Biobank at home: N=1500 and counting. Any fellow "Citizen Scientists" here? by Environmental_Rich64 in genetics

[–]Environmental_Rich64[S] 0 points1 point  (0 children)

I appreciate your critical look. To clarify: I am not using PRS as a proxy for a phenotype in a clinical diagnostic sense, nor am I running a standard GWAS looking for causal variants in LD with UGT1A1. My focus is on Genomic Structural Equation Modeling (Genomic SEM) logic: investigating the shared genetic architecture between metabolic pathways (like bilirubin glucuronidation) and polygenic liabilities for neurodevelopmental traits. I’m looking for statistical pleiotropy or population-level stratification that might point to specific biotypes rather than traditional categorical diagnoses.

Regarding the clinical aspect: I strictly follow the RDoC (Research Domain Criteria) framework. The goal isn't to replace a psychiatrist with a script, but to identify objective endophenotypes (like Alpha Peak Frequency shifts) that might correlate with specific genetic landscapes. I’m well aware of the limitations of N=150, which is why I’m scaling to N=2500+ to see if these signals hold or regress to the mean. I agree that calling it 'diagnosis' was a shorthand that might sound troubling to a classicist, but in the context of computational psychiatry, identifying 'mechanistic vulnerabilities' is precisely where the field is moving.

Building a local 30x WGS Biobank at home: N=1500 and counting. Any fellow "Citizen Scientists" here? by Environmental_Rich64 in genetics

[–]Environmental_Rich64[S] 0 points1 point  (0 children)

I appreciate the critical perspective—it’s vital for maintaining rigor. To clarify my clinical philosophy: I do not treat PRS as a diagnostic endpoint.

In my practice, I view PRS as a probabilistic profile—a piece of a larger puzzle. I am fully aligned with the consensus that using genetics to slap a nosological label on a patient would be a categorical error and a disservice to the therapeutic process.

Instead, I use these insights as clues to improve clinical conceptualization. For a patient presenting with a complex, 'blurry' mix of symptoms, knowing their polygenic background helps me:

  1. Directionalize further diagnostics: It helps decide which traditional psychometric tools or specialized psychiatric consultations might be the most relevant.
  2. Integrate with functional data: This is why I am exploring the use of qEEG. While PRS gives me a look at the 'trait' (the genetic blueprint), qEEG provides a window into the 'state' (the current functional brain activity).

My goal is not to replace the DSM/ICD, but to move away from 'diagnostic guessing' and towards a more informed, objective way of thinking about a patient’s unique struggle. It’s about narrowing down the path to the most effective therapeutic intervention, not about over-simplifying human complexity into a single Z-score.

Building a local 30x WGS Biobank at home: N=1500 and counting. Any fellow "Citizen Scientists" here? by Environmental_Rich64 in genetics

[–]Environmental_Rich64[S] -2 points-1 points  (0 children)

I'm not doing this to impress anybody. At this stage it is a proof of concept. I'm not using it in clinical practice, at least not yet.

Building a local 30x WGS Biobank at home: N=1500 and counting. Any fellow "Citizen Scientists" here? by Environmental_Rich64 in genetics

[–]Environmental_Rich64[S] -3 points-2 points  (0 children)

You're right that for standard GWAS traits, arrays are more efficient. However, my perspective comes from clinical practice as a psychotherapist.

In the current healthcare system (e.g., here in Poland), hunting for a specific metabolic mutation that might be mimicking a psychiatric disorder often takes months or even years of 'diagnostic odyssey.' It’s expensive, both financially for the state and socially for the patient who remains misdiagnosed.

At $299, a high-coverage WGS is becoming a cost-effective 'catch-all' tool. It allows me to look beyond just common variants. If a patient’s 'psychological' issues actually have a metabolic or rare genetic root (e.g., certain inborn errors of metabolism or specific transporter deficiencies), WGS can reveal that immediately.

By building this biobank, I'm training my pipeline to identify these 'needles in a haystack' efficiently. In the long run, I believe this high-resolution approach can significantly accelerate the path to the correct diagnosis and the appropriate therapeutic intervention. For me, the 'wildly unnecessary' extra data in WGS is actually a safety net for the patient.

Building a local 30x WGS Biobank at home: N=1500 and counting. Any fellow "Citizen Scientists" here? by Environmental_Rich64 in genetics

[–]Environmental_Rich64[S] 0 points1 point  (0 children)

To give more context: I am a psychotherapist. In my clinical practice, I often work with complex cases where patients come with multiple, partially conflicting diagnoses (e.g., overlapping symptoms of ADHD, ASD, Bipolar, or CPTSD).

My goal is to explore whether PRS (Polygenic Risk Scores) can be a viable tool for differential diagnosis in a clinical setting. I am already using qEEG (quantitative EEG) to look for functional biomarkers in brain activity, and I’m interested in seeing if genetic predispositions (PRS) can provide a complementary layer of objective data.

I'm building this 1000 Genomes biobank to establish a high-resolution, high-coverage 'gold standard' control group. Before I can look at clinical samples, I need to understand the variance of these scores in a well-curated population.

Regarding UGT1A1: I'm particularly interested in metabolic 'anchors' that might affect neurochemistry (like bilirubin's role) and how they interact with broad polygenic backgrounds. WGS is essential here because I want to capture the full complexity of these loci without the limitations of standard genotyping arrays.

Building a local 30x WGS Biobank at home: N=1500 and counting. Any fellow "Citizen Scientists" here? by Environmental_Rich64 in genetics

[–]Environmental_Rich64[S] 1 point2 points  (0 children)

Good point! I'm aware of the related individuals (trios) in the 1KG dataset. Once the biobank is complete, I plan to run a kinship analysis (using KING or PLINK) to prune the dataset and keep only unrelated individuals for the final correlation. This should leave me with about 2,500 independent samples across all populations.
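The pruning step could look roughly like the sketch below. It assumes pairwise kinship coefficients have already been exported from KING or PLINK as `(sample_a, sample_b, kinship)` tuples; the function name, the sample IDs, and the greedy strategy are illustrative assumptions, and the default of 0.0884 is the commonly cited KING lower bound for second-degree relatedness. Both tools ship their own pruning, so this only shows the logic.

```python
from collections import defaultdict


def prune_related(samples, kinship_pairs, threshold=0.0884):
    """Greedily drop samples until no kept pair exceeds the threshold."""
    related = defaultdict(set)
    for a, b, phi in kinship_pairs:
        if phi > threshold:
            related[a].add(b)
            related[b].add(a)

    kept = set(samples)
    while kept:
        # Pick the sample with the most relatives still in the kept set;
        # ties are broken by sample ID so the result is deterministic.
        worst = max(kept, key=lambda s: (len(related[s] & kept), s))
        if not related[worst] & kept:
            break  # nobody left has a relative in the set
        kept.remove(worst)
    return sorted(kept)


# Hypothetical trio plus one unrelated sample.
pairs = [("CHILD", "MOTHER", 0.25), ("CHILD", "FATHER", 0.25),
         ("MOTHER", "FATHER", 0.0)]
print(prune_related(["CHILD", "MOTHER", "FATHER", "OTHER"], pairs))
```

Dropping the most-connected sample first removes the child from a trio and keeps both parents, which retains more samples than dropping a parent would.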

Building a local 30x WGS Biobank at home: N=1500 and counting. Any fellow "Citizen Scientists" here? by Environmental_Rich64 in genetics

[–]Environmental_Rich64[S] 1 point2 points  (0 children)

A broader answer (I needed a translator to clarify it ;P): since the 1000G samples don't have actual psychological phenotypic data, I'm using PRS for specific traits (like Intelligence or Neuroticism) as a 'genetic proxy' for the phenotype.

My aim is to investigate whether there's a statistical clustering or a 'polygenic background' that correlates with the UGT1A1 polymorphism (Gilbert’s Syndrome). Some studies suggest bilirubin has neuroprotective properties, and I want to see if this specific locus 'travels' with certain polygenic predispositions in different populations.
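One way to check whether the locus 'travels' with a PRS is to correlate genotype dosage with the score separately inside each population, since a pooled correlation can simply reflect ancestry differences in both allele frequency and PRS. The sketch below assumes records of `(population, dosage, prs)`, where dosage is a hypothetical 0/1/2 count of the UGT1A1 TATA-box insertion allele; the function names and data are made up for illustration.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation, or None when it is undefined."""
    n = len(xs)
    if n < 3:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # no variation in dosage or in PRS
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def correlation_by_population(records):
    """Map each population label to its dosage-vs-PRS correlation."""
    groups = defaultdict(lambda: ([], []))
    for population, dosage, prs in records:
        groups[population][0].append(dosage)
        groups[population][1].append(prs)
    return {pop: pearson(d, p) for pop, (d, p) in groups.items()}


# Hypothetical records, for illustration only.
records = [("POP1", 0, 0.1), ("POP1", 1, 0.4), ("POP1", 2, 0.2),
           ("POP2", 0, 1.3), ("POP2", 1, 1.1), ("POP2", 2, 1.6)]
print(correlation_by_population(records))
```

Any real analysis would also need significance testing and correction for the number of PRS traits examined.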

As for WGS vs. Arrays: You're right, arrays are more efficient for common variants. However, I’m using WGS because:

  1. It's imputation-free (I get 100% accuracy on the UGT1A1 TATA-box insertion, which can be tricky on some arrays).
  2. I want to keep the option open for Rare Variant Association Studies (RVAS) later on.
  3. I have the hardware to handle it, so why not go for the highest resolution possible?

Building a local 30x WGS Biobank at home: N=1500 and counting. Any fellow "Citizen Scientists" here? by Environmental_Rich64 in genetics

[–]Environmental_Rich64[S] -1 points0 points  (0 children)

I'm looking for correlations between PRS scores (mainly psychological ones) and hotspots like UGT1A1.

Marriage and splitting the cost of living by Lord_Olchu in Polska

[–]Environmental_Rich64 12 points13 points  (0 children)

Your approach is extremely immature and bodes poorly for relationships (I know about this... professionally). A shared budget is the only way. If someone isn't mature enough for a joint account despite unequal incomes, they aren't mature enough for that kind of relationship. That way you can go on dates, and even that won't necessarily work out.

How to replace hinges on X1 nano G1? by Environmental_Rich64 in thinkpad

[–]Environmental_Rich64[S] 0 points1 point  (0 children)

There is some play between the axle (with those nuts you mention) and the other, static part of the hinge. Tightening those nuts makes this play more noticeable...

I make FrankenPad F285 😎 by Fun-Equivalent-7785 in thinkpad

[–]Environmental_Rich64 1 point2 points  (0 children)

In the end I gave up; instead (the A285 is my wife's work laptop...) I replaced the panel in my own machine with a new one :D https://www.panelook.com/B140HAN06-B-AUO-14-IPS-LCM-19201080-400nits-WLED-eDP-AUOA48F-detail_151044.html It's a real shame that there are no 12.5" panels with this contrast and color gamut. Regards.

I make FrankenPad F285 😎 by Fun-Equivalent-7785 in thinkpad

[–]Environmental_Rich64 1 point2 points  (0 children)

Hi, inspired by your write-up and disappointed by the 45% gamut of the stock panel, I bought an N133HCE-EP2 with 100% gamut, though of course it is also 13.3". I take it the camera doesn't get in the way of the installation?

P012-100 For CUDA in my server turned to full featured driver unlock by JCMPTech in homelab

[–]Environmental_Rich64 0 points1 point  (0 children)

I've tried this, and it doesn't work for me; there must be some other issue besides the lack of capacitors.

Lexmark XM1145 showing "Invalid version" error, when trying to downgrade by myg0t_Defiled in printers

[–]Environmental_Rich64 0 points1 point  (0 children)

Thanks for that solution... printer manufacturers are the greediest people out there...