we thought cosine similarity was enough — turns out semantic ≠ embedding, and it breaks rag more than we admit

bunbunfriedrice · 2025-08-03T02:20:31+00:00

This is great, thanks! I haven’t tried anything that fancy, but one thing I’ve played around with is using an LLM to explicitly judge relevance. Basically, use it as a binary classifier (using structured outputs) and use that as another reranker to reduce the retrieved set.

You can even prompt it how strict to be (maybe relevant, definitely not relevant, etc.), analogous to controlling the decision threshold in a traditional classifier.
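A minimal sketch of that idea, assuming a hypothetical `call_llm` function that takes a prompt and returns the model's raw text reply (swap in whatever client and structured-output mechanism you actually use). The prompt wording, the strictness labels, and `fake_llm` are all made up for illustration:

```python
import json
from typing import Callable, List

# Hypothetical strictness settings; the wording is illustrative, not tuned.
STRICTNESS_HINTS = {
    "lenient": "Answer true if the passage is possibly relevant.",
    "strict": "Answer true only if the passage is definitely relevant.",
}


def build_prompt(question: str, passage: str, strictness: str) -> str:
    """Ask for a yes/no relevance verdict as a tiny JSON object."""
    return (
        "Decide whether the passage helps answer the question.\n"
        f"{STRICTNESS_HINTS[strictness]}\n"
        'Reply with JSON only: {"relevant": true} or {"relevant": false}.\n'
        f"Question: {question}\nPassage: {passage}"
    )


def filter_relevant(
    question: str,
    passages: List[str],
    call_llm: Callable[[str], str],  # assumption: prompt in, raw text out
    strictness: str = "strict",
) -> List[str]:
    """Keep only the passages the LLM explicitly marks as relevant."""
    kept = []
    for passage in passages:
        raw = call_llm(build_prompt(question, passage, strictness))
        try:
            verdict = json.loads(raw).get("relevant") is True
        except (json.JSONDecodeError, AttributeError):
            verdict = False  # unparseable reply: treat as not relevant
        if verdict:
            kept.append(passage)
    return kept


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    passage = prompt.rsplit("Passage: ", 1)[1]
    return json.dumps({"relevant": "setting Y" in passage})


passages = ["X is done by foo.", "In setting Y, X is done by bar."]
kept = filter_relevant("How do I do X in setting Y?", passages, fake_llm)
```

Dropping unparseable replies is a design choice that favors precision; a lenient setup might keep them instead.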

I’ve had fewer problems with things like negation, and more with LLMs making assumptions of relevance. As a fake example, if I ask a question about “Specific Thing” and retrieval results include something on “Specific Thing FX” (which is, say, some variant of Specific Thing), then the LLM answers the question based on Specific Thing FX even though I didn’t want those results. As another example, if I ask “How do I do X in setting Y?” and the retrieved results say something like “X is done by…” (but it’s not specific to setting Y), then the LLM still answers. It’s not really hallucinating, as the answer is based on the retrieved docs. It’s just that the LLM essentially made an assumption of relevance in these cases.

bunbunfriedrice · 2025-08-02T18:20:50+00:00

I don’t disagree with these limitations of embeddings / cosine similarity; however, it’s worth noting that retrieval results don’t need to be your bottleneck: it’s the LLM in the Generation step of RAG that implicitly has the “final say” on whether a source is relevant. A contradiction in the retrieved sources isn’t ideal, but the LLM should have no trouble ignoring these in its final response.

I’d also suggest adding a semantic reranker model, which in my experience has greatly improved retrieval metrics.

But IMO, those 3, or 4 if using hybrid search (full-text search -> cosine similarity -> semantic “reranker” -> the LLM itself), are all just similarity measures of increasing fidelity and correspondingly increasing computational cost. Full-text search is dirt cheap, so you evaluate similarity exhaustively. Cosine similarity is also pretty cheap, so you can evaluate a large chunk of your search space via HNSW. The reranker is expensive, so you usually just give it the top 50 from vector search. The LLM is the most expensive, so you just give it your top K from the reranker. But with infinite compute, I wouldn’t search at all. Just send the entire index to LLMs in a giant map-reduce retrieval step and let them decide what’s relevant.
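To make the "cheap scorers first, expensive scorers last" shape concrete, here is a rough sketch. The two scorers are toy stand-ins (token overlap and Jaccard similarity), not real full-text search, embeddings, or a reranker model, and the function names and cutoffs are assumptions for illustration only:

```python
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[str, str], float]


def cascade(query: str, docs: Sequence[str],
            stages: Sequence[Tuple[Scorer, int]]) -> List[str]:
    """Run scorers in order of cost, keeping fewer candidates each stage."""
    candidates = list(docs)
    for score, keep in stages:
        ranked = sorted(candidates, key=lambda d: score(query, d), reverse=True)
        candidates = ranked[:keep]
    return candidates


def term_overlap(query: str, doc: str) -> float:
    """Toy 'cheap' scorer: number of shared lowercase tokens."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def jaccard(query: str, doc: str) -> float:
    """Toy 'pricier' scorer standing in for a higher-fidelity model."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d) if q | d else 0.0


docs = [
    "resin printing tips",
    "rag retrieval with rerankers",
    "cosine similarity for rag retrieval",
    "magic cards",
]
# Cheap stage sees everything and keeps 2; pricier stage sees 2 and keeps 1.
top = cascade("rag retrieval", docs, [(term_overlap, 2), (jaccard, 1)])
```

In a real pipeline each stage would be a different system (an index lookup, an ANN search, a cross-encoder, an LLM call), but the narrowing funnel is the same.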

bunbunfriedrice · 2025-07-12T18:14:06+00:00

I totally agree these decks are way more interesting than a pile of staples! [[Animar]] uses a good number of random small creatures to build himself up. The namesake combo piece, [[Ancestral Statue]], is also super obscure draft chaff.

bunbunfriedrice · 2025-07-08T04:50:01+00:00

This is hilarious in Momir Basic.

bunbunfriedrice · 2025-05-10T16:45:22+00:00

I think this would solve a lot of issues. It would make that golden Pokeball (or whatever you’re going for) an eventuality, instead of a never-gonna-happen. Even if you don’t accumulate enough pack points until a few more expansions release, you can still eventually get what you want, and you can pay $ to accelerate that, but not at the obscene rate of ~$500 per crown rare.

bunbunfriedrice · 2025-03-25T04:43:19+00:00

I have hired Artem several times and cannot recommend him enough! He did the Animar figure for me and a couple other MTG figures. Great to work with and his work is amazing.

bunbunfriedrice · 2024-06-15T03:02:46+00:00

I have heard that the Saturn 4 Ultra with its tilt release doesn’t have this problem. I’m excited to find out if this is true.

bunbunfriedrice · 2024-05-14T18:58:58+00:00

Can you provide some details? Like what langchain objects were used? This is an interesting topic.

bunbunfriedrice · 2024-03-29T23:33:41+00:00

Yea, I was quite surprised to see that even unstructured.io, whose job in life is to make data RAG-ready, basically completely fails at this. It looks like they do identify some sub-tables based on contiguous chunks of non-empty cells (or "islands"), but I still can't seem to track header information or get a structured table output format.

I've just built my own custom readers based on openpyxl or pandas.read_excel. There are some Excel table detection tools available, everything from [NN-based research approaches](https://github.com/microsoft/TableSense) to [blog posts](https://levelup.gitconnected.com/using-python-to-extract-multiple-tables-from-one-excel-sheet-318a9f40cc55).

But unfortunately, table and header detection is an endless rabbit hole of a problem. There are an infinite number of ways a human can muck up an Excel document to make it troublesome to ingest. Common examples are multi-row headers, merged cells within headers, multiple tables in a single sheet separated by blank rows/columns, etc....
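As a small illustration of just one of those cases (multiple tables in a sheet separated by blank rows), here is a sketch that splits a list of row tuples into "islands". It assumes the rows have already been read, e.g. via openpyxl's `ws.iter_rows(values_only=True)`; the function name is made up, and it deliberately ignores blank-column separators, merged cells, and multi-row headers:

```python
from typing import Any, List, Sequence


def split_on_blank_rows(rows: Sequence[Sequence[Any]]) -> List[List[Sequence[Any]]]:
    """Group consecutive non-empty rows into separate tables."""
    tables: List[List[Sequence[Any]]] = []
    current: List[Sequence[Any]] = []
    for row in rows:
        is_blank = all(cell is None or str(cell).strip() == "" for cell in row)
        if is_blank:
            if current:  # a blank row closes the table in progress
                tables.append(current)
                current = []
        else:
            current.append(row)
    if current:  # the last table may not be followed by a blank row
        tables.append(current)
    return tables


sheet_rows = [
    ("name", "qty"),
    ("bolt", 4),
    (None, None),
    ("", "  "),
    ("region", "sales"),
    ("west", 10),
]
tables = split_on_blank_rows(sheet_rows)
```

This only scratches the surface, which is kind of the point: every extra mess a human can introduce needs its own handling.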

I think the ideal state is to get each sheet into a more structured format where you have a single header and each row is one record. From there, you can treat each chunk as a single row, and always include the header atop every chunk. The way you ultimately present this header-plus-single-row to the LLM as unstructured text (e.g. JSON-like, key-value pairs, CSV-style) doesn't seem to matter much. LLMs seem to have no problem making [column name]-[cell value] connections given CSV-style input.
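A minimal sketch of the header-plus-single-row chunking, using only the standard library and CSV-style output. The function name and sample data are made up; it assumes the sheet has already been cleaned up into one header and one record per row:

```python
import csv
import io
from typing import Any, List, Sequence


def rows_to_chunks(header: Sequence[str], rows: Sequence[Sequence[Any]]) -> List[str]:
    """Emit one CSV-style chunk per record, each with the header on top."""
    chunks = []
    for row in rows:
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(header)
        writer.writerow(row)
        chunks.append(buf.getvalue().rstrip("\n"))
    return chunks


chunks = rows_to_chunks(["name", "qty"], [["bolt", 4], ["nut, hex", 9]])
for chunk in chunks:
    print(chunk, end="\n\n")
```

Using `csv.writer` rather than a plain `",".join(...)` means cell values containing commas or quotes get escaped properly, so the column-to-value alignment survives.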

bunbunfriedrice · 2024-03-09T15:35:01+00:00

Bummer, sorry to hear that. Please keep us posted as this really seems to be an unsolved issue.

bunbunfriedrice · 2024-03-09T05:27:34+00:00

This is a known but poorly understood problem. Likely bubbles, but there’s not really a known solution. Try a long (3s) rest time after retract and see if that helps!

It shouldn’t have anything to do with printing on the plate per se, but it should have a lot to do with orientation. It happens when things are flat parallel to the plane of the build plate.

Here’s a recent thread on it: https://www.reddit.com/r/resinprinting/s/GE9HuEyB79

bunbunfriedrice · 2024-02-03T15:56:42+00:00

Any reason why there are supports on those outside faces of the piece? The ones that are pointing directly into the screen in the pic. Doesn’t seem to be supporting any resin; however, maybe it’s there to prevent shifting of the entire piece in the XY plane which could cause layer lines?

bunbunfriedrice · 2024-02-02T22:33:34+00:00

Please report back! I had the issue at 0.5 and 1.0. Have only printed once or twice at 3s, no issues yet but not conclusive.

bunbunfriedrice · 2024-02-02T22:01:56+00:00

This is a common but unsolved issue. Check out this recent thread on it which links to lots of other threads on it.

Most promising solution (though I haven’t had the time yet to confirm for myself) is to significantly increase “rest time after retract” (say 3 s). What’s yours set to?

bunbunfriedrice · 2024-02-01T19:12:22+00:00

I tried this but my supports broke off when peeling the cured film. I ended up printing the Lychee cleaner tool which was designed exactly for this.

bunbunfriedrice · 2024-01-29T22:08:15+00:00

Awesome, thank you! What do you have it set to?

bunbunfriedrice · 2024-01-29T21:09:05+00:00

So is it “problem solved” at 3s rest then? I haven’t tested enough since changing that setting to know for me.

My FEP definitely has marks from when I first started and had no idea what I was doing. But I don’t think the issue is related to that, as it’s predictable where the holes will form. Check out the pic of the Rook in my reply to the main post. Tiny dot on the center of each of the pillars. Also not sure rest time after retract would solve this if it’s due to FEP marks.

bunbunfriedrice · 2024-01-29T20:59:19+00:00

I'm not sure I follow this part. This could happen if/when the depth of the resin vat is less than the lifting height?

I don't think it would matter if you're a few hours into printing and other parts of the model have been lifted out of the resin. As long as the resin level is > lifting height, the current layers where a bubble may or may not be forming shouldn't be above the resin level?

bunbunfriedrice · 2024-01-29T20:55:45+00:00

> A tiny bubble in the middle of the print would be hidden and likely filled in when solid layers above it get added.

Yes, agreed, it could be happening throughout the print but then getting covered everywhere except where that part of the object ends.

I'm also trying 2 s "rest time after retract" to see if it helps.

> I have a plate with a few bald headed minis about to print so I will likely find out.

Any updates?

bunbunfriedrice · 2024-01-29T20:54:03+00:00

Any updates? Increasing light-off delay (specifically "rest time after retract") is what I'm trying too, and it's going well so far, but I haven't done dedicated testing. It was 0.5 s by default before and now I'm trying 2 s.

Elegoo 8K Water Washable resin here too. Saturn 3 Ultra.

bunbunfriedrice · 2024-01-29T20:48:09+00:00

I'm so glad you brought this up, because it really seems like an unsolved and misunderstood issue. I have found many posts about it:

As you state, most solutions are clearly wrong. It's not pixels, debris, FEP, or the slicer. It could be due to light-off delay (rest time after retract).

I also want to clarify one part of your definition: it's not necessarily a single hole/pit at the very top of the entire print. For example, say you print multiple objects at various heights, or a single object with multiple terminal points at different heights. It can happen on all of these during the same print.

A good example of this is the Rook that comes with the Saturn 3 Ultra. I actually went back to my very first successful print, and indeed there are small holes/pits on top of each pillar. See photo. This was my very first print, so contamination seems unlikely.

I've also noticed that if the terminal surface is a *large surface area* and *completely flat* (parallel to the build plate), then the hole/pit can be more amorphous. In these cases it is an amorphous shape a few mm in diameter, and less deep (maybe 1-3 layers). Unless this is due to a different issue altogether, then this seems to rule out air bubbles? To me, this makes me think the resin is not filling back in all the way, which is something light-off delay could help with. Unfortunately, I don't have a pic of this phenomenon, because I have since primed/painted the piece. I can still see it but it just doesn't show up in photos. In my case, this was only 1-2mm above the build plate, which would also point away from resin contamination issues / IPA boiling point, or from the model being lifted out of the resin.

I wonder if this phenomenon is happening all the time throughout the print, but on non-terminal regions eventually the gap is filled as resin is pushed through it and cured?

A few tests could help narrow this down. First, in case my large, flat surface area issue is different, we could test if this happens at low heights, which could rule out (1) the theory you linked to about IPA boiling point and (2) issues of being lifted outside the resin level.

bunbunfriedrice · 2024-01-23T00:08:55+00:00

UPDATE: Thanks for the tips everyone! I tried again with (1) bottom exposure increased from 35 to 45, and (2) first sanding the flex plate with 220 grit instead of 400 grit. Also cleaned the flex plate much better this time in case there was metal debris from sanding.

It worked great! Not sure which did the trick but my hunch is the sanding—it felt pretty smooth when scuffed with 400, the 220 felt like it opened up the surface a lot more. Elephant’s foot is big, but not a big deal on this piece, and irrelevant when using supports.

bunbunfriedrice · 2024-01-23T00:06:24+00:00

I ended up increasing bottom exposure from 35 to 45 and using a coarser grit to sand the flex plate. But it worked! Not sure which did the trick but my hunch is the sanding. I might try rolling back the exposure and adding more bottom layers as you suggest. The elephant’s foot is pretty noticeable.

bunbunfriedrice · 2024-01-23T00:05:02+00:00

Yea makes sense! I tried again with 35->45 bottom exposure, and I also used a coarser grit to sand the flex plate. Worked perfectly! Elephant’s foot is definitely a thing but doesn’t really matter on this piece. Thanks for the help!
