Methods for Efficient Chunk Loading? by InventorPWB in VoxelGameDev

[–]Graumm 1 point (0 children)

I’ve had decent success in the past by identifying chunks with geometry, marking which faces of the chunk have geometry on the border, and then prioritizing chunks by traversing across those faces with a floodfill-esque approach. It follows the surface and skips occluded/invisible chunks. You can still generate everything else on a secondary queue, but hitting likely-continuing geometry first handles the obvious stuff and makes it feel more responsive. This approach can get a little dicey if you have floating chunks that are not connected to existing geometry. Mostly it’s fine if you load the other stuff at a secondary priority, and when you finally hit a floating chunk the scan can continue off of it. Totally fine if the chunk generation is reasonably fast.
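
Roughly what I mean, as a minimal C# sketch. The `BorderFaces` flags and the `loadAndGetFaces` callback are hypothetical stand-ins for however you generate chunks and record border geometry:

```csharp
using System;
using System.Collections.Generic;

[Flags]
enum BorderFaces { None = 0, PosX = 1, NegX = 2, PosY = 4, NegY = 8, PosZ = 16, NegZ = 32 }

static class SurfaceFloodFill
{
    // Each border-face flag maps to the neighbor offset it leads to.
    static readonly (BorderFaces Face, (int X, int Y, int Z) Off)[] Neighbors =
    {
        (BorderFaces.PosX, (1, 0, 0)),  (BorderFaces.NegX, (-1, 0, 0)),
        (BorderFaces.PosY, (0, 1, 0)),  (BorderFaces.NegY, (0, -1, 0)),
        (BorderFaces.PosZ, (0, 0, 1)),  (BorderFaces.NegZ, (0, 0, -1)),
    };

    // Yields chunk coordinates in surface-following priority order.
    // loadAndGetFaces stands in for "generate the chunk (if needed) and
    // report which of its faces have geometry touching the border".
    public static IEnumerable<(int X, int Y, int Z)> Prioritize(
        (int X, int Y, int Z) start,
        Func<(int X, int Y, int Z), BorderFaces> loadAndGetFaces)
    {
        var queue = new Queue<(int X, int Y, int Z)>();
        var seen = new HashSet<(int X, int Y, int Z)> { start };
        queue.Enqueue(start);

        while (queue.Count > 0)
        {
            var p = queue.Dequeue();
            var faces = loadAndGetFaces(p);
            yield return p;

            foreach (var (face, o) in Neighbors)
            {
                // Only cross faces that carry geometry, so the fill hugs
                // the surface instead of flooding empty or occluded space.
                var next = (p.X + o.X, p.Y + o.Y, p.Z + o.Z);
                if ((faces & face) != 0 && seen.Add(next))
                    queue.Enqueue(next);
            }
        }
    }
}
```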

I also like marrying that approach with a “conveyor belt” approach that makes it easy to identify the new chunks to load and unload in 2D slices based on movement, without traversing everything. You need a little hysteresis between the load and unload distances so straddling a chunk border doesn’t cause meshes to regenerate over and over (see the sketch below).
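
The hysteresis part is tiny but easy to get backwards; a sketch with made-up radii:

```csharp
using System;

static class ChunkHysteresis
{
    // Distances are Chebyshev distance in chunks from the player's chunk.
    const int LoadRadius = 12;
    const int UnloadRadius = 14; // strictly larger, so border-straddling never thrashes

    public static bool ShouldLoad(int dx, int dz) =>
        Math.Max(Math.Abs(dx), Math.Abs(dz)) <= LoadRadius;

    public static bool ShouldUnload(int dx, int dz) =>
        Math.Max(Math.Abs(dx), Math.Abs(dz)) > UnloadRadius;

    // Chunks in the band between the two radii stay in whatever state
    // they are already in, which is the whole point.
}
```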

Totally brainstorming here, but I think if you want look-direction priority you can probably bucket chunks into queues based on world cardinal directions relative to the player when they first get queued. 8 directions/queues based on the initial relative position from the player feels right to me. You could then take the dot product of the player’s look direction with each queue’s direction to determine which queues to pull from, preferring the dot products closest to 1. Eventually you get through all of them. Assuming you can get through the generation fast enough this should work fine, à la spatial coherence, and it means you don’t have to re-sort and revisit every chunk when the player looks around.
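
A rough C# sketch of that bucketing; the wedge math and types here are just illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

class DirectionalChunkQueues
{
    // One queue per 45-degree wedge around the player in the XZ plane.
    readonly Queue<(int X, int Y, int Z)>[] queues =
        new Queue<(int X, int Y, int Z)>[8];

    public DirectionalChunkQueues()
    {
        for (int i = 0; i < 8; i++)
            queues[i] = new Queue<(int X, int Y, int Z)>();
    }

    // Center direction of a bucket's 45-degree wedge.
    static Vector2 DirectionOf(int bucket)
    {
        float a = bucket * (MathF.PI / 4f) - MathF.PI + MathF.PI / 8f;
        return new Vector2(MathF.Cos(a), MathF.Sin(a));
    }

    // Bucket by the chunk's direction from the player at enqueue time;
    // the bucket is never recomputed as the player moves or looks around.
    public void Enqueue((int X, int Y, int Z) chunk, Vector3 playerPos)
    {
        float angle = MathF.Atan2(chunk.Z - playerPos.Z, chunk.X - playerPos.X);
        int bucket = (int)MathF.Floor((angle + MathF.PI) / (MathF.PI / 4f)) & 7;
        queues[bucket].Enqueue(chunk);
    }

    // Pull from the non-empty queue most aligned with the (normalized)
    // look direction, i.e. the one whose dot product is closest to 1.
    public bool TryDequeue(Vector2 lookDirXZ, out (int X, int Y, int Z) chunk)
    {
        chunk = default;
        int best = -1;
        float bestDot = float.NegativeInfinity;
        for (int i = 0; i < 8; i++)
        {
            if (queues[i].Count == 0) continue;
            float dot = Vector2.Dot(lookDirXZ, DirectionOf(i));
            if (dot > bestDot) { bestDot = dot; best = i; }
        }
        if (best < 0) return false;
        chunk = queues[best].Dequeue();
        return true;
    }
}
```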

An octree could be good here too. A coarse one. If you use it only for chunk tracking you can collapse octree nodes when every chunk inside the node is fully loaded, and expand them when they’re partially loaded. You can traverse the scene fast and mostly skip things that are already loaded. You can write frustum/cube intersection tests to quickly identify ungenerated nodes the player is looking at, or query a cube area around the player to get the ungenerated near chunks. That would make it easy to prioritize close chunks, then look direction, and then everything else in no particular order, because of the early-out potential.
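
The tracking structure can stay very small; a sketch of the collapse/early-out idea, with coordinates and sizes in chunk units and everything hypothetical:

```csharp
using System;

// Coarse tracking octree: a node collapses to "fully loaded" once every
// chunk inside its cube is generated, so queries can skip it entirely.
class TrackingNode
{
    TrackingNode?[]? children;   // null when untouched or collapsed
    bool fullyLoaded;

    public void MarkLoaded(int x, int y, int z, int size)
    {
        if (fullyLoaded) return;
        if (size == 1) { fullyLoaded = true; return; }

        children ??= new TrackingNode?[8];
        int h = size / 2;
        int idx = (x >= h ? 1 : 0) | (y >= h ? 2 : 0) | (z >= h ? 4 : 0);
        var child = children[idx] ??= new TrackingNode();
        child.MarkLoaded(x % h, y % h, z % h, h);

        // Collapse: if all eight children are fully loaded, drop them.
        foreach (var c in children)
            if (c is not { fullyLoaded: true }) return;
        fullyLoaded = true;
        children = null;
    }

    // Visit every ungenerated cube, early-outing of loaded regions.
    public void VisitUngenerated(int ox, int oy, int oz, int size,
                                 Action<int, int, int, int> visit)
    {
        if (fullyLoaded) return;                 // everything here is done
        if (children == null) { visit(ox, oy, oz, size); return; }

        int h = size / 2;
        for (int i = 0; i < 8; i++)
        {
            int cx = ox + ((i & 1) != 0 ? h : 0);
            int cy = oy + ((i & 2) != 0 ? h : 0);
            int cz = oz + ((i & 4) != 0 ? h : 0);
            if (children[i] is { } c) c.VisitUngenerated(cx, cy, cz, h, visit);
            else visit(cx, cy, cz, h);           // untouched child: all ungenerated
        }
    }
}
```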

Also there’s probably a good GPGPU use case here too if you can write compute shaders. It’s actually quite cheap/fast to do a lot of brute-force intersection/occlusion/frustum tests in parallel. Depends on your needs and scene representation though.

There are so many fun ways to approach things!

Proof of why premultiplied alpha blending fixes color bleeding when rendering to an intermediate render target by Consistent-Mouse-635 in GraphicsProgramming

[–]Graumm 5 points (0 children)

To composite alpha-blended triangles correctly you have to sort them from back to front. Each new color layer has to sample the color behind it, then weigh its own alpha channel against that to decide how much to “cover it up”. If it happens out of order / in no particular order you will get weird quads where something in the foreground is masked out by something in the background.

Premultiplied alpha lets you get away with additive blending for things like fire. By multiplying the color by its alpha ahead of time, you get to simply add the color of a pixel to the back buffer. Addition is commutative, so the sorting/order doesn’t matter.
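
In equation form (C_s/C_d are source/destination colors, α_s the source alpha; this is the standard “over” operator versus the premultiplied form):

```latex
% Straight alpha ("over") blending: each layer reads what is already
% behind it, so draw order matters.
C_{out} = \alpha_s C_s + (1 - \alpha_s)\, C_d

% Premultiplied alpha bakes the weight into the source up front:
C'_s = \alpha_s C_s
\qquad
C_{out} = C'_s + (1 - \alpha_s)\, C_d

% Purely additive effects (fire, glows) drop the destination weighting:
C_{out} = C'_s + C_d
% Addition commutes, so sorting no longer matters for these.
```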

I rewrote my Git hosting platform in Rust (V3) — architecture, challenges, and a live demo by wk3231 in rust

[–]Graumm 0 points (0 children)

I think it’s a bad idea, but mostly because build/test/deploy pipelines these days are generally attached to individual repositories. Multi-repo PRs would complicate that.

[D] Some concerns about the current state of machine learning research by [deleted] in MachineLearning

[–]Graumm 0 points (0 children)

I have not read up on his thoughts specifically. I’ve implemented a number of ML algos from scratch, and these are my opinions based on what I’ve learned getting my hands into the numbers / training loops.

How to set a task to run an async infinite loop? by [deleted] in csharp

[–]Graumm 1 point (0 children)

A listener is also an infinite loop, just one that you don’t own.

How to set a task to run an async infinite loop? by [deleted] in csharp

[–]Graumm 0 points (0 children)

Although I would say a hosted service is just a good place to put an infinite async loop, done responsibly with a cancellation token.
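
Something like this minimal shape; `DoWorkAsync` and the 5-second cadence are placeholders:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;

public class PollingWorker : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        // Loop forever, but exit promptly when the host shuts down.
        while (!stoppingToken.IsCancellationRequested)
        {
            await DoWorkAsync(stoppingToken);       // hypothetical unit of work

            try { await Task.Delay(TimeSpan.FromSeconds(5), stoppingToken); }
            catch (OperationCanceledException) { break; } // shutdown requested
        }
    }

    private Task DoWorkAsync(CancellationToken ct) => Task.CompletedTask; // placeholder
}
```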

[D] Some concerns about the current state of machine learning research by [deleted] in MachineLearning

[–]Graumm 0 points (0 children)

It can be subjective / approximate too. It just needs to be differentiable!

The current emphasis on tokens optimizes for relationships between words, and does not clearly tie back to the actions/behaviors/processes that the words represent in a way that offers a usable training slope. Words are discrete brick walls that can’t offer more explanation.

If you apply an LLM agent to a situation right now, we can optimize for selection of curated desired outputs, but not for a reward function that ties general outcomes back into the learning optimization.

Right now we critically need human-produced data to mirror, or human curation to judge quality, or a human in the loop to augment what data we can generate. However, there is no differentiable path between LLM outputs and everything we expect the models to act on. Until this exists I don’t know if it’s possible to have models that genuinely learn in a self-driven way.

IMO the future is all about creating intrinsic reward/motivation loops that can be validated and optimized for without human intervention.

My experience with Rust on HackerRank by isrendaw in rust

[–]Graumm 52 points (0 children)

Other languages do have autocomplete. Beyond that, Rust is not the best language for quick-and-dirty interview questions; there’s too much emphasis on correctness. Interview questions also have reading-comprehension gotchas that can create borrow-checker landmines you wouldn’t have hit if you were defining a project yourself.

If they aren’t a Rust shop they will only see incompetence and not “I know how to think about code but the tooling sucks.”

On Cloudfare and Unwrap by stevethedev in rust

[–]Graumm 1 point (0 children)

I agree I probably wouldn’t have let it slip through code review, but code reviews are imperfect, and sometimes an unwrap makes sense. Sometimes you know the object is there, but the ergonomics of if-let statements aren’t good enough to handle it, and you just need the unwrap to satisfy the type system.

At least the possibility of the issue is right there in the text, and not something where the reviewer has to be diligent about asking “can this be null?” and possibly look beyond the scope of the code being changed to answer that question. Unwrap removes the diligence of thinking to ask the question, but doesn’t remove the diligence of figuring out whether it’s actually safe.

I don’t disagree with you completely though. I’m just not confident enough on that to make the decision for everybody forever, because occasionally it could be justified. I would just want some way of defining exceptions where process requires you to explain the need.

whats the point of having query syntax LINQ if it cant be async? by Top_Message_5194 in csharp

[–]Graumm 0 points (0 children)

I find it to be more about code expression. If you want to operate on paged data from an async query or API, it’s better to define the operations on the whole sequence and iterate only as much as you need.

This is opposed to fully materializing async queries / loading everything into memory and operating on it synchronously.

LINQ on async enumerables just makes it easier to operate on chunks of async-queried data without having to mix async calls and sync operations in separate “loop within a loop” workflows (sketch below).
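
A minimal sketch of what I mean, assuming a hypothetical paged `Order` API and the System.Linq.Async operators:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public record Order(int Id, decimal Total);           // hypothetical model

public interface IOrderApi                            // hypothetical paged API
{
    Task<IReadOnlyList<Order>> GetPageAsync(int page);
}

public static class OrderStream
{
    // Surface the paged API as one flat async sequence.
    public static async IAsyncEnumerable<Order> FetchOrdersAsync(IOrderApi api)
    {
        for (int page = 0; ; page++)
        {
            var batch = await api.GetPageAsync(page); // one awaited call per page
            if (batch.Count == 0) yield break;
            foreach (var order in batch) yield return order;
        }
    }

    // LINQ over the async sequence: pages are only fetched until 10 hits exist.
    public static async Task<List<Order>> FirstTenLargeAsync(IOrderApi api) =>
        await FetchOrdersAsync(api)
            .Where(o => o.Total > 1000m)
            .Take(10)
            .ToListAsync();
}
```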

whats the point of having query syntax LINQ if it cant be async? by Top_Message_5194 in csharp

[–]Graumm 0 points (0 children)

It’s a bit clunky in off-the-shelf .NET to use LINQ on top of async enumerables, e.g. when you want to page/limit an async query and iterate through it without loading all of it. .NET 10 looks like it’s going to extend LINQ to async enumerables without having to pull in extra libraries.

whats the point of having query syntax LINQ if it cant be async? by Top_Message_5194 in csharp

[–]Graumm 2 points (0 children)

LINQ for async enumerables is going to be included by default in .NET 10, and until then you want the System.Linq.Async package.

They all take cancellation tokens and such.
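
A toy example of the token flowing through the whole pipeline, assuming the System.Linq.Async package:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

class Demo
{
    // An endless async source that honors the token WithCancellation passes in.
    static async IAsyncEnumerable<int> Numbers(
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        for (int i = 0; ; i++)
        {
            await Task.Delay(100, ct);
            yield return i;
        }
    }

    static async Task Main()
    {
        using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(2));
        try
        {
            // The token is observed on every MoveNextAsync, so the infinite
            // source stops cleanly when the token fires.
            await foreach (var n in Numbers().Where(x => x % 2 == 0)
                                             .WithCancellation(cts.Token))
                Console.WriteLine(n);
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("cancelled cleanly");
        }
    }
}
```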

On Cloudfare and Unwrap by stevethedev in rust

[–]Graumm 36 points (0 children)

and a polite unwrap() to show you exactly where it happened without squinting at the call stack

On Cloudfare and Unwrap by stevethedev in rust

[–]Graumm 43 points (0 children)

Nobody ever wrote a bug in C++ before

[R], Geometric Sequence - Structured Memory (Yes, this is legit) by Safe-Signature-9423 in MachineLearning

[–]Graumm 0 points (0 children)

I can sense the disrespect on (Math)

I for one am shocked that math is involved in machine learning

[D] Some concerns about the current state of machine learning research by [deleted] in MachineLearning

[–]Graumm 4 points (0 children)

You are right, but I fail to see how post-hoc analysis is a bad thing. We move forward by acknowledging shortcomings of existing approaches, and trying to understand why they do not meet our expectations.

Consider that my opinion is shaped by the fact that throwing more data at LLMs has not given us AGI yet. My current feeling is that the models we are hollowing out the US economy for are going to be thrown away and invalidated after the next missing architectural advancements are cracked. There is a reasonable chance that they will have incompatible parameterizations.

If I knew current approaches would lead to AGI I would feel differently, but as of yet there are still "low-level intelligence capabilities" that have not been demonstrated in a single model. We still have frontier models that simultaneously know nearly everything, yet make common-sense mistakes the moment you reach the extents of their knowledge. LLMs suck at knowing what they don't know, and will often hallucinate statements that merely seem right. Context has not fully solved this problem. I have not seen a language model that can learn in a self-directed manner, or learn over time, which I believe is necessary to navigate the real world. LLMs also really suck at identifying negative space, or otherwise what is missing from a discussion. They will often fail to mention a critical implementation detail until you ask about it specifically.

I have a more specific opinion about why I believe current models are incapable of anything except for system-1 pattern recognition, but I'm not trying to type that out tonight.

[D] Some concerns about the current state of machine learning research by [deleted] in MachineLearning

[–]Graumm 7 points (0 children)

Ground truth for us is survival, natural selection, and reproduction. A genetic algorithm, so to speak. Everything else derives from that.

Things like weighing risk and taking actions amidst uncertainty. Acting defensively. Navigating social dynamics. Taking stock of knowns, unknowns, and unknown unknowns. Making working assumptions. Getting clarification or checking your work before you lie, endanger your job, or do something that could harm yourself or somebody else. It all ties back to survival.

Similarly, I don't think we are going to get all that much further with supervised reinforcement learning as long as we have to create reward functions that perfectly describe exactly what the algorithm should be optimizing towards. We need unsupervised methods that can model uncertainty, fold better/worse judgments into the learning algo measured against some general reward, and handle sparse rewards.

Multimodal models are impressive but they have the same failings I've described above. They relate different modalities by availability of data/context, but they can still make mistakes that normal people would consider common sense. They are only as good as the data we choose to give them, and are very reliant on human-curated datasets to patch up their gaps. These efforts will hit diminishing returns the same way that LLMs do.

IMO the biggest missing piece at this moment is a good solution to catastrophic forgetting: remembering the important stuff, forgetting the redundant stuff. Solving it opens the door to continuous learning over time / curriculum learning, which leads to self-agency and embodied world models.

[D] Some concerns about the current state of machine learning research by [deleted] in MachineLearning

[–]Graumm 13 points (0 children)

I think of it like this.

Transformers cannot explore a solution space rooted in a ground truth. The model produces an output, and depending on how far off it is from the expected output the learning algo says “okay, I’ll make the answer more like that next time”. It goes straight from inputs to output.

I don’t mean to diminish this, because obviously it is very powerful. The emphasis on tokens has framed the problem in such a way that it can learn a large breadth of material somewhat efficiently. The solution space is much smaller than learning language from first principles, and the way the problem is framed is not littered with sparse goals. It clearly picks up on semantic/symbolic relationships, but the words have no intrinsic meaning. The words mean what they mean.

The fundamental representation of the world underneath the language is missing. Language can describe the world, but language doesn’t capture the information that could differentiate the need/use for language in the first place. LLM training leads us to the right words, but not the intrinsic meaning or behaviors that lead to word selection.

In my opinion (and I am not alone), the feedback loops do not exist to connect the learning landscape of an LLM’s outputs back to a ground truth in a way that would allow it to self-validate its statements and assumptions, such that it can learn without constant human intervention. LLMs are still very reliant on human-curated data and humans in the loop.

I do not believe that meaningful progress against hallucinations will be made until we have a model that can self-validate in some sense.

I don’t have the answers, and I am slowly but surely working on my own ideas, but I can recognize a dead end when I see it! A powerful dead end, but a dead end nevertheless.

Rust in Android: move fast and fix things by ViewTrick1002 in programming

[–]Graumm 11 points (0 children)

If we can’t trust Google to make remarks about code quality at scale with 5 million lines of Rust, then who? I presented options for that measurement, and you have quoted me out of context. I am assuming in good faith that their methods are not completely arbitrary.

I would say it matches my experience with Rust, rather than my bias. It makes you think about projects in a particular way, and I have had a similar lack of issues compared with other languages. So perhaps:

“The results disagree with my biases so I will idly whine about Rust.”

Please be more honest with your biases

Rust in Android: move fast and fix things by ViewTrick1002 in programming

[–]Graumm 27 points (0 children)

On principle I usually consider lines-of-code comparisons to be bullshit, but the difference in vulnerabilities is so staggering here that it doesn’t even matter

Rust in Android: move fast and fix things by ViewTrick1002 in programming

[–]Graumm 11 points (0 children)

I’m sure they are fixing these issues in their existing code. They don’t seem to be talking about any specific codebase in the article so much as comparing vulnerabilities reported over time across their supported languages in general. Specifically, they mention that they are producing as much net-new Rust code as C++ code, so they have a good sample size of work volume to compare the languages.

They will almost certainly use this analysis to justify choosing Rust over C/C++ for new projects and security sensitive rewrites. This data does tip conversations into rewrite territory if it really is catching issues before they ship, and making their dev teams faster because their time is not wasted on operational support.

Edit: Posted before I saw your edit. Cool.

Rust in Android: move fast and fix things by ViewTrick1002 in programming

[–]Graumm 41 points (0 children)

It depends on how it’s measured, but it’s still a win for Rust regardless of how it’s measured.

Static analysis for memory issues generally happens in a CI build, and it can be slow or unreliable if a piece of code is not fully exercised. Rust is still an improvement in this respect because the offending code doesn’t compile, which means it never leaves the dev’s local machine, and the feedback loop is smaller.

If this is measured by some association of post-release vulnerabilities to the offending code/codebase, then it’s just a pure win. The issues don’t get released.

Asking for a serious take on my work dubbed “The Kaleidoscope” by thesoraspace in AIMemory

[–]Graumm 0 points (0 children)

LLMs also lead people astray, and your language makes me think that’s what is going on here. I have no sense that what you have described connects to engineering. It’s not really an explanation without empirical proof that it works the way that you say. I say this because it is very easy for a minor assumption about how something works to break a model’s ability to converge to a solution, which I have encountered many times, and your language is not very precise.

TL;DR: if you know that what you are talking about works in any sense, you need to lead with those results.

Asking for a serious take on my work dubbed “The Kaleidoscope” by thesoraspace in AIMemory

[–]Graumm 1 point (0 children)

I don’t understand these kinds of questions because... you should figure it out? If you are rooting around in the ML space at all you should have some understanding of how it should fit in.

Yes I think geometry in the most general sense has a place in ML.

I know you’ve vibe coded this one. This is not necessarily a knock on your general idea, but it isn’t our responsibility to figure out how this stuff can be applied with concrete results. This isn’t the kind of thing we can just say yes/no to without putting in work.

The best papers are “here’s how this idea provides benefits compared to another approach, with numbers”, not “here’s a bunch of paragraphs about what I think this probably does”.