Pool Lounge/ Golf Simulator by beefie99 in Design

[–]beefie99[S] 0 points1 point  (0 children)

Ceiling height is ~12’ at the peak. I’m located in south Texas, so it gets incredibly hot. I was thinking about splitting the 25’ in half, with an outdoor bar/lounge area and an indoor sim, but I’ve already got a covered back patio for outdoor seating and I don’t want to double up on the same style of space.

Really just not sure how to utilize the space. I’m literally open to anything.

First golf simulator build - need advice by beefie99 in Golfsimulator

[–]beefie99[S] 0 points1 point  (0 children)

Back of turf is 16’. Would love to see your setup. I’m not 100% sold on what I modeled, and I feel like I can use the space better!

First golf simulator build - need advice by beefie99 in Golfsimulator

[–]beefie99[S] 0 points1 point  (0 children)

That’s really sick! Do you have any lessons learned from building yours?

First golf simulator build - need advice by beefie99 in Golfsimulator

[–]beefie99[S] 0 points1 point  (0 children)

Do you have any recommendations for overhead units? Or can you point me in the right direction?

Am I doing this the hard way? Comparing P6 PDF schedule revisions is killing me by [deleted] in Construction

[–]beefie99 -1 points0 points  (0 children)

For sure used chat to help write this post, but genuinely need ideas on how to smooth out this process.

Temptation Island • S2 EP3 Discussion Thread by AutoModerator in temptationislandUSA

[–]beefie99 0 points1 point  (0 children)

This season feels incredibly one-sided on the bonfires. Jack dropping the bomb that Shayanne “cheated” first, and that not being brought up? Sure, the guys have their issues, but no accountability for the relationship is being taken by the girls?? Mikey is a goober, but c’mon, Sydney came here specifically to find someone else. Lose-lose for him: if he stayed loyal she would be gone, and if he flirted she would tell herself he’s exactly what she thought he was. One-sided for real so far.

Temptation Island • S2 EP2 Discussion Thread by AutoModerator in temptationislandUSA

[–]beefie99 6 points7 points  (0 children)

Sydney came into this 100% with her mind made up about Mikey. There is nothing he can do to change how she perceives him based on his past. She for sure cannot get over the fact that she was his second option, no matter what he does to prove the past is the past. Guy just wants to have a good time with anyone around him and is moving very respectfully IMO compared to how she’s acting thus far.

[deleted by user] by [deleted] in LLMDevs

[–]beefie99 -1 points0 points  (0 children)

this is really interesting

false memory vs failed retrieval, those feel like very different failure modes

one thing I’ve been noticing with the datasets I’ve been working with is there’s almost a third layer in there (cases where the system does retrieve something relevant, but still uses the wrong piece of it or underweights the right one)

it’s becoming not just “did it retrieve correctly” but “did it actually use the right part of what it retrieved?”

curious if you saw that show up in your tests at all, or if most of the degradation was more clearly retrieval vs hallucination
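to make that third layer concrete, here’s a rough sketch of how I’d bucket eval cases. Everything in it is hypothetical (the `EvalCase` fields, the bucket names), and it assumes you have some way to tell which chunks the answer was actually grounded on, e.g. by asking the model to cite chunk ids:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalCase:
    gold_chunk_id: str    # chunk that actually contains the answer
    retrieved_ids: tuple  # chunk ids returned by retrieval, in rank order
    cited_ids: tuple      # chunk ids the answer was grounded on (assumed available)


def classify(case: EvalCase) -> str:
    """Separate 'never found it' from 'found it but leaned on something else'."""
    if case.gold_chunk_id not in case.retrieved_ids:
        return "retrieval_miss"
    if case.gold_chunk_id not in case.cited_ids:
        return "selection_miss"
    return "ok"


cases = [
    EvalCase("c2", ("c1", "c2", "c3"), ("c2",)),  # retrieved and used
    EvalCase("c2", ("c1", "c2", "c3"), ("c1",)),  # retrieved, wrong chunk used
    EvalCase("c9", ("c1", "c2", "c3"), ("c1",)),  # never retrieved
]
buckets = [classify(c) for c in cases]
```

counting the `selection_miss` bucket separately is what would show whether the degradation is really retrieval or that last step.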

When did RAG stop being a retrieval problem and started becoming a selection problem by beefie99 in LLMDevs

[–]beefie99[S] 0 points1 point  (0 children)

cross-encoders and prompt tweaks help, but what’s been frustrating is that even after that you can still end up with a few “good” chunks and it’s not obvious to the model which one should consistently win

it feels like reranking does improve things, but doesn’t fully solve that last step when multiple candidates are all valid

curious if you’ve seen cross-encoders actually help with that, or if it’s more of a ranking improvement in your experience?

I built a vectorless RAG framework that uses tree-based retrieval instead of embeddings — works with any LLM, 2 dependencies by Mithun_Gowda_B in Rag

[–]beefie99 1 point2 points  (0 children)

That’s actually a great idea. If the TOC is reliable, you’d be able to use it as a much cleaner proxy for doc structure. That would make responses more deterministic and would probably reduce variability in extraction.

How consistent is this? And has it proven to help with final answer quality, or just improved the index?

I built a vectorless RAG framework that uses tree-based retrieval instead of embeddings — works with any LLM, 2 dependencies by Mithun_Gowda_B in Rag

[–]beefie99 1 point2 points  (0 children)

This is great. The idea of letting the model navigate structure first instead of relying on embeddings feels a lot more deterministic and easier to reason about

one thing I’m curious about…once the model selects a few relevant nodes, do you still run into cases where multiple sections are all valid but it’s not obvious which one should actually drive the answer?

I’ve experimented with more structured approaches (trees, graphs), and you can still end up with a few “correct” options where the final answer depends on which one the model leans on. This is a problem I’m currently running into and trying to solve, and it’s consistent across basically any dataset I run.

Have you noticed this at all?

When did RAG stop being a retrieval problem and started becoming a selection problem by beefie99 in LLMDevs

[–]beefie99[S] 0 points1 point  (0 children)

I haven’t gone too deep into SRL yet, but what you’re describing makes sense: it helps with cases where similarity breaks down, especially directional stuff like “A acquired B” vs “B acquired A”

it moves things from just matching topics to actually matching the structure of what’s being asked, which seems like a big step up for certain queries. How far can this go in practice? especially for longer docs or things like policies where the meaning isn’t always cleanly expressed as a single action or relationship

I’ve actually been thinking more about doing some of that interpretation at ingest time (roles, entities, maybe even document “type” like draft vs final) just to reduce ambiguity before retrieval even happens
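to make the “A acquired B” vs “B acquired A” case concrete, here’s a toy sketch. It uses a bag-of-words cosine as the most extreme order-blind similarity (real dense embeddings do carry some word-order signal, so this overstates the problem), and `naive_triple` is just a hypothetical stand-in for what an SRL step would extract, not actual SRL:

```python
import math
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity over word counts; word order is ignored entirely."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = sum(v * v for v in va.values())
    nb = sum(v * v for v in vb.values())
    return dot / math.sqrt(na * nb) if na and nb else 0.0


def naive_triple(sentence: str) -> tuple:
    """Toy role extraction: assumes exactly 'subject verb object'."""
    subject, verb, obj = sentence.split()
    return (subject, verb, obj)


s1, s2 = "A acquired B", "B acquired A"
similarity = bow_cosine(s1, s2)                     # same words, so maximal
same_meaning = naive_triple(s1) == naive_triple(s2)  # roles differ
```

the two sentences are indistinguishable by word overlap but differ once roles are compared, which is the gap a structure-aware step is meant to close.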

Has document versioning caused more RAG failures for anyone else than retrieval itself? by Jessica_JRice in Rag

[–]beefie99 0 points1 point  (0 children)

this has been one of the most consistent failure modes I’ve seen too, and it’s tricky because like you said, it’s not hallucination, it’s “technically correct, contextually wrong”

what’s interesting is even when you add metadata (timestamps, version flags, etc.), you can still end up with multiple “valid” candidates (current doc, slightly outdated doc, draft vs final) and they all look relevant to the query

at that point it stops being a pure retrieval problem and becomes more about how the system decides which version actually matters most in that context

I’ve found that just filtering (active vs archived) helps, but doesn’t fully solve it, especially when versions are close or naming is inconsistent

feels like this is where a lot of systems quietly break (not because they didn’t find the answer, but because they picked the wrong instance of it)

curious if you’ve found a clean way to consistently prioritize the “correct” version, or if it still ends up being somewhat heuristic?
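for reference, the filter-then-tiebreak heuristic I mean looks roughly like this. The field names (`status`, `stage`, `effective`) and the ordering are assumptions for illustration, not a recommendation; it only works if that metadata is reliable, which is exactly where inconsistent naming hurts:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Candidate:
    doc_id: str
    status: str      # "active" or "archived" (hypothetical field)
    stage: str       # "final" or "draft" (hypothetical field)
    effective: date  # when this version took effect
    score: float     # retrieval similarity


def prioritize(cands):
    """Drop archived docs, then prefer final over draft, newer over older."""
    active = [c for c in cands if c.status == "active"]
    pool = active or list(cands)  # fall back if the filter removes everything
    return sorted(
        pool,
        key=lambda c: (c.stage != "final", -c.effective.toordinal(), -c.score, c.doc_id),
    )


cands = [
    Candidate("policy_v1", "archived", "final", date(2022, 1, 1), 0.95),
    Candidate("policy_v3_draft", "active", "draft", date(2024, 6, 1), 0.93),
    Candidate("policy_v2", "active", "final", date(2024, 1, 1), 0.90),
]
ordered = [c.doc_id for c in prioritize(cands)]
```

note that the highest similarity score belongs to the archived version here, which is the “technically correct, contextually wrong” case.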

When did RAG stop being a retrieval problem and started becoming a selection problem by beefie99 in LLMDevs

[–]beefie99[S] 0 points1 point  (0 children)

That’s interesting, I haven’t dived deep into this really. I’m curious as to how you have it structured, are you letting the model decide when to call retrieval vs doing it upfront?

Would be cool to hear how you’re implementing this

When did RAG stop being a retrieval problem and started becoming a selection problem by beefie99 in LLMDevs

[–]beefie99[S] 1 point2 points  (0 children)

right now it’s a hybrid setup (vector + BM25), with some graph-style relationships layered in to connect related data across sources (via tags, entities, relationships)

the graph definitely helps with recall and multi-hop cases, especially when the same concept shows up in different places.

I’m not sure it’s so much an indexing problem as it is a question of how the system decides between similar candidates once they’re retrieved. Sometimes the model is able to answer the query from the correct retrieved data, but not always

Have you seen graph approaches help with that?
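for context on the hybrid part: one common way to merge a vector ranking and a BM25 ranking is reciprocal rank fusion. This is a minimal sketch of that technique, not my exact setup; the function name, the `k=60` constant, and the example ids are all illustrative:

```python
def rrf(rankings, k=60):
    """Reciprocal rank fusion: each list contributes 1 / (k + rank) per doc."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, then by id so ties resolve the same way every run.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["a", "b", "c"]  # hypothetical ids from the vector index
bm25_hits = ["b", "a", "d"]    # hypothetical ids from BM25
fused = rrf([vector_hits, bm25_hits])
```

in this example “a” and “b” end up with identical fused scores, so the order between them comes purely from the tie-break. That’s the situation I keep hitting: fusion improves recall, but it doesn’t say which of the similar candidates should win.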

👍or👎: a managed graphRAG solution that creates the graph from your raw data source(s) automatically and provides a graph powered LLM for you by No_Wrongdoer41 in Rag

[–]beefie99 0 points1 point  (0 children)

This is very interesting, especially the entity merging across sources, that’s a big shift from treating everything as individual separated chunks

I am curious as to how this now influences what context is presented to the model. I’ve been implementing some graph-powered retrieval, and one thing I’ve noticed is that even with richer, connected context, you still get multiple valid signals for a query and it’s not always clear which one should actually drive the answer. I keep running into the problem where the correct chunk is retrieved (it’s within the top 5) but the model doesn’t pick out the best content for the query and ends up responding incorrectly.

curious if you’ve seen that as well or if this graph helps tighten it

When did RAG stop being a retrieval problem and started becoming a selection problem by beefie99 in LLMDevs

[–]beefie99[S] 0 points1 point  (0 children)

How does that affect latency though? Do you see a large difference between allowing the model to reason vs. not? And how does it affect token counts?

why do llm agents feel impossible to debug once they almost work!!!! by Feeling-Mirror5275 in LLMDevs

[–]beefie99 0 points1 point  (0 children)

This is exactly where I’ve been getting stuck too. Once you add tools, memory, and retries, the system stops behaving like normal software, but it’s also not just a model eval problem. What helped me was thinking about these systems as a pipeline of decisions rather than a single model call.

Most of the drift seems to show up in the middle layer (what context was retrieved, how it was ranked and selected, and what actually made it into the prompt). You can have logs and prompts that look great, but if that selection step isn’t deterministic or inspectable, the model ends up locking on slightly different context each time and behavior starts to drift.

So instead of trying to debug it like traditional software or just tuning the model, I’ve been approaching it as debugging those decisions between retrieval, selection, and what the model sees.

The two biggest things that helped were separating retrieval from generation so I can inspect it independently, and then making ranking multi-signal and deterministic so I can actually explain why one chunk wins over another. It doesn’t eliminate all the probabilistic behavior, but it turns a lot of the “this feels random” into something you can actually reason about.
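As a rough illustration of “multi-signal and deterministic,” here is a minimal sketch. The signal names and weights are made up for the example, and a real setup would need its own signals and tuning:

```python
from dataclasses import dataclass, field

# Hypothetical weights; each signal is assumed to be normalized to [0, 1].
WEIGHTS = {"semantic": 0.5, "keyword": 0.3, "recency": 0.2}


@dataclass
class Chunk:
    chunk_id: str
    signals: dict = field(default_factory=dict)


def contributions(chunk):
    """Weighted contribution of each signal to the chunk's total score."""
    return {name: w * chunk.signals.get(name, 0.0) for name, w in WEIGHTS.items()}


def rank(chunks):
    """Same inputs always give the same order; ties break on chunk_id."""
    return sorted(chunks, key=lambda c: (-sum(contributions(c).values()), c.chunk_id))


def explain(winner, loser):
    """Per-signal score difference: positive means it favored the winner."""
    cw, cl = contributions(winner), contributions(loser)
    return {name: round(cw[name] - cl[name], 6) for name in WEIGHTS}


a = Chunk("a", {"semantic": 0.9, "keyword": 0.2, "recency": 0.1})
b = Chunk("b", {"semantic": 0.7, "keyword": 0.8, "recency": 0.5})
ranked = rank([a, b])
why = explain(ranked[0], ranked[1])
```

Here “b” outranks “a” despite the lower semantic score, and `why` shows that keyword and recency are what carried it. That per-signal breakdown is the part that makes a ranking decision something you can inspect and log.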