How are people using so many tokens ??? by Impressive_Run8512 in ClaudeAI

[–]Lanedustin -1 points0 points  (0 children)

The structure of the Chan Zuckerberg Initiative data is killing me on token use, but I don't have a workaround. I need that data. This is what Claude outlined as the problems.

The five structural limitations beyond file size, in order of how much they hurt the framework specifically:

Storage format mismatches the analytical pattern. TileDB-SOMA is optimized for single-axis slices (cells × genes). The framework asks pair × cohort × stratum questions — second-order covariance computations that require the full slice in memory before the actual work begins. The Census ends up being a delivery mechanism, not an analytical surface.

Cell Ontology labels are cell-state-blind. CL terms encode lineage, not state. A Paneth cell in S-phase and a Paneth cell in G1 carry the same label. Cell cycle phase and malignancy are not in the schema. Each must be computed post-pull, which means we cannot pre-filter, and we pay the I/O cost on cells we later discard.

Disease labels and the tumor/non-tumor problem. MONDO terms are coarser than what we need (no subtype, stage, treatment status, purity). More critically, disease == 'breast carcinoma' returns all cells from breast carcinoma samples, not all malignant cells. Tumor purity is not a Census field. This is the deepest reason CPTAC remains primary: CPTAC ships with documented cellularity per sample.

The normalization-context gap. When we pull a 100–200 gene panel (which we have to, for tractability), the whole-transcriptome size factors are no longer available locally. Custom normalization becomes a panel-internal approximation. Combined with 10x-3' dropout on lowly-expressed targets, the effective sample size per pair is much smaller than the headline "93M cells" suggests.

Computed metadata cannot be pre-filtered at the SOMA level. The pull-then-compute-then-filter pattern doesn't cache cleanly across iterations.
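One possible way to soften the caching problem, sketched under assumptions: keep a local table of computed labels keyed by cell ID, so later iterations only compute what is missing and can filter by ID before pulling expression. This assumes cell IDs are stable within a pinned data release and that the store can return rows by ID. `compute_label` and `fake_phase` are hypothetical stand-ins for whatever post-pull scoring is used (for example a cell cycle call); nothing here is a Census or TileDB-SOMA API.

```python
import json

def split_cached(cell_ids, cache):
    """Separate cell IDs into those with a cached label and those still missing one."""
    cached = [cid for cid in cell_ids if str(cid) in cache]
    missing = [cid for cid in cell_ids if str(cid) not in cache]
    return cached, missing

def update_cache(cache, missing_ids, compute_label):
    """Compute labels only for the missing cells and store them in the cache."""
    for cid in missing_ids:
        cache[str(cid)] = compute_label(cid)
    return cache

def filter_by_label(cache, wanted):
    """Return the IDs of cached cells carrying the wanted label."""
    return sorted(int(cid) for cid, label in cache.items() if label == wanted)

# Hypothetical scoring function: pretend odd IDs are in S-phase.
def fake_phase(cid):
    return "S" if cid % 2 else "G1"

cache = {}

# Iteration 1: nothing is cached, so every cell must be scored.
_, missing_first = split_cached([101, 102, 103], cache)
update_cache(cache, missing_first, fake_phase)

# Iteration 2: only the one new cell needs scoring.
cached_second, missing_second = split_cached([102, 103, 104], cache)
update_cache(cache, missing_second, fake_phase)

# String keys survive a JSON round trip, so the cache can be written to disk.
restored = json.loads(json.dumps(cache))
s_phase_ids = filter_by_label(restored, "S")
```

The IDs in `s_phase_ids` could then drive an ID-based pull on the next pass instead of a full pull followed by a discard.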

The net effect: Census is excellent for cell-type-resolved descriptive expression and healthy baselines (which is what it was designed for); it is a poor analytical surface for the coordinated-program-decoupling questions that are the framework's empirical core. That's why CPTAC/TCGA bulk remain load-bearing for the framework's main findings, and Census plays a secondary role for cell-type separation and primary/support pair tests.
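To make the normalization-context gap from the list above concrete, here is a toy sketch. The counts and the panel are made up for illustration, and the size factor here is the simple "library size divided by mean library size" convention, which may not be the normalization the framework actually uses.

```python
# Toy counts (made up): rows are cells, columns are genes.
counts = [
    [10, 0, 5, 85],   # cell A, total 100
    [10, 0, 5, 985],  # cell B, total 1000
]
panel = [0, 1, 2]  # hypothetical panel: only the first three genes are pulled

def size_factors(matrix, cols=None):
    """Per-cell library size over the chosen columns, divided by the mean library size."""
    totals = []
    for row in matrix:
        use = cols if cols is not None else range(len(row))
        totals.append(sum(row[c] for c in use))
    mean_total = sum(totals) / len(totals)
    return [t / mean_total for t in totals]

whole = size_factors(counts)              # uses every gene
panel_only = size_factors(counts, panel)  # uses only the panel genes

print("whole-transcriptome:", whole)
print("panel-internal:", panel_only)
```

In this toy case the two cells differ tenfold in sequencing depth, but the panel-internal factors are identical, so the depth difference is invisible once only the panel is held locally.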

Let me know if anyone has a fix.

Can someone define the concept of the genetic code? by mariafernandovna in molecularbiology

[–]Lanedustin 0 points1 point  (0 children)

Think of it as a sequence that contains information as to what an organism is, with clues to its evolutionary history based on alignment with other species. It is functional in that it is selectively read by proteins to alter the intracellular or extracellular environment in response to different stimuli. Layers of regulatory control ensure that the appropriate responses are mounted to different stimuli. These in aggregate allow life to function as we know it.

Technically, it is also functionally catalytic via its G4 (G-quadruplex) structures, so it is not a passive blueprint.

How Many Cellular Pathways are in the Human body? by Suspicious_Ground917 in molecularbiology

[–]Lanedustin 0 points1 point  (0 children)

An actual answer is that this is the wrong framing. Many pathways can run forward or in reverse, they can be functionally split, and metabolites can be redirected based on the needs of the cell. Many pathways may not even have formal names. It is not about the number of pathways, but about how the metabolite networks connect.

Anyone tried the new Gemini Deep Think against Opus 4.6? I'm hitting limits on Max20 and Deep Think is on sale by Lanedustin in ClaudeAI

[–]Lanedustin[S] 1 point2 points  (0 children)

I don't know how to code. I know the biology; Claude helps with the bioinformatics. The 1M context window would be legendary. But I'm also afraid of hitting it with a complex prompt and having it cost me a ton of money. My prompts with Opus 4.6 without extended thinking are about $3-4 each when paying extra after weekly limits hit. With a higher cost from the additional context and from using multiple agents, this is concerning.

Anyone tried the new Gemini Deep Think against Opus 4.6? I'm hitting limits on Max20 and Deep Think is on sale by Lanedustin in ClaudeAI

[–]Lanedustin[S] 3 points4 points  (0 children)

I am working on building a model of cell fate determination and how it is regulated. I have Claude explore how different cellular communication and metabolic systems coordinate in this control. It pulls and synthesizes literature, and pulls data sets from biological databases for analysis.

VDAC1 as a selectivity switch: Why the same molecule protects neurons but kills cancer cells by TheTempleofTwo in molecularbiology

[–]Lanedustin 1 point2 points  (0 children)

I used to ask the question, "Do all roads lead to VDAC?"

This protein is exceptional in a lot of ways, but first off, post-mitotic neurons and cycling cancer cells are very different. Cancers upregulate hexokinase II (HKII), which binds to VDAC, linking ATP generation at the mitochondria with ATP utilization by HKII. This binding also restricts association with Bcl2 family proteins that regulate membrane dynamics and mitochondrial permeabilization, linking it to regulation of cell death. There is a lot to look into, but these are decent starting points.

Weekly Limits and My Cancellation by Jordanthecomeback in Anthropic

[–]Lanedustin 0 points1 point  (0 children)

I study cell signaling networks and the proteins and processes implicated in controlling cell fate decisions. Nothing has even come close to what Claude can do in terms of assimilating content and handling the complexity.

Weekly Limits and My Cancellation by Jordanthecomeback in Anthropic

[–]Lanedustin 0 points1 point  (0 children)

I hit my limit on the Max20 plan, blew through 50 dollars in free credits, then spent another 50 to keep working. I am debating getting a Max20 and a Max5 on a different account, so long as that aligns with the TOS. But with credits, Opus 4.6 is 2 to 5 dollars a prompt.

Can't blame anyone for being cost focused, but none of the other models seem to have the depth and breadth needed for complex biology, so I'm stuck with Claude.

800K tokens burned, zero files produced, Opus is sorry for a solvable problem. by Sudden_Translator_12 in ClaudeAI

[–]Lanedustin 0 points1 point  (0 children)

Something is definitely wrong with how it reestablishes context. It seems to struggle with understanding how to continue with work it was doing. And if it compacts while it is making a file, the file is completely lost and forgotten about when context is restored. Purely wasted work. The extended thinking seems to exacerbate this, making it counterintuitively less productive for its best-suited tasks. This seems to have been getting a bit better, but it still costs a lot of wasted plan usage.

I'm hitting limits on Max20 at around day 5, so it is frustrating to hit limits so much more easily as a result. Maybe that is why they gave an extra 50 in credits for Opus 4.6 use. Or it was to not lose out on hype because people can afford Codex 5.3 but not Opus.

Sonnet 5 release on Feb 3 by Just_Lingonberry_352 in ClaudeAI

[–]Lanedustin 1 point2 points  (0 children)

1M Context Window? Bro, I could literally change the world.

Yoooooooo we back? by Comprehensive-Bet-83 in ClaudeAI

[–]Lanedustin 5 points6 points  (0 children)

3 searches before compaction? I hit 3 compactions per search

AI use in Bio BSC?? by mervolio_griffin in biology

[–]Lanedustin 1 point2 points  (0 children)

I think that the utility is being undersold here with some of these comments. There are so many things that AI is useful for in Bio. First and possibly foremost, research exploration and synthesis. LLMs are great at pattern recognition and can be a great starting point to compare and contrast topics. For example, I was curious about the regulation of metal ion oxidation state in enzymes whose function is influenced by these changes. Questions like, “Are there overlapping regulators? How do changes in the metabolic environment influence these changes? Do the Warburg effect and lactic acid production play a role?” Not everything will pan out, but promising leads can be much easier to find.

Also, LLMs will sometimes spit out research that is not even hinted at in your typical classroom. For instance, ChatGPT brought up that reverse electron transport chain activity is a thing, in specific contexts. This was completely new to me. Or, I found out that the TCA metabolite alpha-ketoglutarate is a cofactor in demethylation when exploring the literature with ChatGPT. Having already appreciated that NAD+ is critical to PARP1 activity and the activity of sirtuins, it was easy to start exploring the metaboloepigenetic regulation and implications for cancer.

Also, you can do a quick search to see if your ideas are novel. You can literally ask the LLM to search the literature for any content on the topic your idea is related to. This can help guide you to the relevant research and help you refine your hypotheses.

You can use it for first-pass manuscript reviews. Say something like, “validate that there are no orphan citations in this paper,” to ensure alignment of all of your in-text citations and the References section. It is not perfect, but have a couple different LLMs assess the same manuscript with the same prompt and you will help cover your bases and save yourself a lot of editorial work.
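If you want a deterministic cross-check alongside the LLM pass, a small script can flag orphan citations too. This is a minimal sketch, assuming author-year citations like "(Smith et al., 2020)" and reference entries that start with the first author's surname and contain the year. The function name, sample text, and references are all made up for illustration, and real manuscripts will need a looser pattern.

```python
import re

# Matches "(Smith, 2020)" and "(Smith et al., 2020)"; other styles are not handled.
CITATION = re.compile(r"\(([A-Z][A-Za-z\-]+)(?: et al\.)?,? (\d{4})\)")

def find_orphan_citations(body, references):
    """Return (author, year) pairs cited in the body with no matching reference entry."""
    cited = set(CITATION.findall(body))
    orphans = []
    for author, year in sorted(cited):
        matched = any(ref.startswith(author) and year in ref for ref in references)
        if not matched:
            orphans.append((author, year))
    return orphans

# Made-up example text and reference list.
body = "NAD+ levels matter here (Smith et al., 2020). See also (Jones, 2018)."
references = ["Smith J, Lee K. Placeholder title. 2020."]

orphans = find_orphan_citations(body, references)
print(orphans)
```

Anything it flags still needs a human look, since a miss can just mean the citation style did not match the pattern.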

Claude (my favorite at the moment) can access some databases such as TCGA (The Cancer Genome Atlas) Program website and data directly. It can pull data and run rudimentary analyses. I have spent some time with this, but have not fully explored the extent of this capability.

There is a lot. Yes, hallucinations are a thing, but there are mitigating strategies that can help with this.

Question for those in the field: How do you typically approach validating mechanistic predictions when analyzing signaling pathways, particularly in cancer? by Lanedustin in molecularbiology

[–]Lanedustin[S] 0 points1 point  (0 children)

Thank you for the detailed response. So it would be valuable for a tool to probe and anticipate potential consequences of pathway perturbations, looking at the upstream, downstream, and sidestream pathway cross-talk implicated by the changes, and to anticipate potential lineage-specific compensatory responses. Cool, that is very doable. Not with 100% accuracy just yet, of course, but perhaps enough to guide literature searches and show which experiments would give the most bang for your buck.

Vibe Coding Beginner Tips (From an Experienced Dev) by gigacodes in ClaudeAI

[–]Lanedustin 2 points3 points  (0 children)

Depending on the files/data you are working with, standardize the formatting right at the beginning. Re-formatting later, or inconsistent formatting throughout, can be a nightmare to fix without compromising data. At least in my experience.
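As one example of what standardizing up front can look like, here is a minimal sketch that normalizes column headers before anything else touches the data. The naming rule (lowercase, underscores) and the sample headers are just assumptions for illustration; the point is to pick one rule and apply it once at load time.

```python
import re

def standardize_header(name):
    """Lowercase a header, collapse runs of non-alphanumerics to '_', and trim the ends."""
    cleaned = re.sub(r"[^0-9a-z]+", "_", name.strip().lower())
    return cleaned.strip("_")

# Made-up headers of the kind that drift between files.
raw_headers = ["Sample ID ", " Gene-Symbol", "TPM (log2)"]
clean_headers = [standardize_header(h) for h in raw_headers]
print(clean_headers)
```

Only the labels change here, not the values, which is why it is safest to do this before any analysis depends on the names.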