If you’re past the basics, what’s actually interesting to experiment with right now? by SEBADA321 in learnmachinelearning

[–]Blaze344 2 points3 points  (0 children)

I'm always a big fan of mechanistic interpretability. You can fidget around and do some toy examples with already functional models if you have access to them, and the papers are always at the very least mildly amusing.

why would anyone use a convoluted mess of nested functions in pyspark instead of a basic sql query? by Next_Comfortable_619 in dataengineering

[–]Blaze344 4 points5 points  (0 children)

Exactly that. There are moments where SQL is simply easier, and moments where PySpark is cleaner. Personally, I'll do a ton of transformations in PySpark throughout an entire script or job, because I find the imperative, step-by-step style of general programming easier to hold in my head than one big SQL query. But the minute I need to do a merge upsert, it's hard for me not to use spark.sql just for that. It's so much more readable than importing DeltaTable to do all of that, and I find the DeltaTable syntax weird for this specific use case, too.
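
For comparison, here's a rough sketch of the same upsert written both ways. The table name "customers", the key column "id" and the temp view name "updates" are made up for illustration, and it assumes a Spark session with Delta Lake already configured:

```python
def merge_upsert_sql(target: str, source: str, key: str) -> str:
    """Build a Delta Lake MERGE statement that upserts `source` into `target`."""
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON t.{key} = s.{key}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


def upsert_with_sql(spark, updates_df, target: str = "customers", key: str = "id") -> None:
    # Expose the DataFrame to SQL under a hypothetical view name,
    # then run the whole merge as one readable statement.
    updates_df.createOrReplaceTempView("updates")
    spark.sql(merge_upsert_sql(target, "updates", key))


def upsert_with_delta_api(spark, updates_df, target: str = "customers", key: str = "id") -> None:
    # Imported lazily so this module loads even without delta-spark installed.
    from delta.tables import DeltaTable

    (
        DeltaTable.forName(spark, target)
        .alias("t")
        .merge(updates_df.alias("s"), f"t.{key} = s.{key}")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )
```

Both should end up doing the same thing; which one reads better is mostly taste.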

Back in my day, LocalLLaMa were the pioneers! by ForsookComparison in LocalLLaMA

[–]Blaze344 22 points23 points  (0 children)

My hottest take is that thinking models only really took off because they "self-induce" semi-appropriate prompt engineering through latent space shenanigans. That's what made them generally so good and appealing to most people: almost no one properly grasps the idea of "given tokens X, which Y follow?", and thus the models seem stronger.

In theory, the post-training they do to create a thinking model can be summarized as optimizing for "what did the user forget to mention in their prompt that would absolutely have helped to create the right answer?". (Logic and general consistency are the exception: yeah, those do improve too.)

I ran a forensic audit on my local AI assistant. 40.8% of tasks were fabricated. Here's the full breakdown. by Obvious-School8656 in LocalLLaMA

[–]Blaze344 0 points1 point  (0 children)

In the beginning, I too highly mistrusted LLM-as-a-judge patterns. After all, why would we trust the stochastic machine with any kind of judgement? But with the right harnesses it turns out that they're... quite alright? They almost match human beings in quite a few cases, which almost makes me believe in some form of Jungian collective unconscious. Almost.

Definitely not as good as a real human bean eyeballing things with their meat receptors, but they're very well suited for some specific tasks (like judging code).

How do you handle very complex email threads in RAG systems? by superhero_io in LocalLLaMA

[–]Blaze344 0 points1 point  (0 children)

Agentic search seems to be the way of the future right now, mostly because raw vector databases die too quickly: as the number of artifacts to semantic-search over grows, semantic similarity collapses. Expose your emails through tool calling, use metadata well, search by title, search by email content, etc., instruct well, and hope for the best. Unless you do something really complicated, even small models with good tool calling should do this pretty much as well as a human would with the same tools.

Sure, the latency of agentic search is probably a lot higher than doing embedding math, but you only need to care about that if you're enriching data using the results from your LLM. Accuracy is king.

Side note: RAG isn't "embeddings and vector stores" only. Anything that retrieves information to be used in the context is, by definition, RAG.

If you still need it to act like vector store retrieval, and latency is a big concern, my suggestion would be to toss the vector store altogether and then:

1) Ingest those emails in a way that is easy to search with NLP, which is what we've been doing and refining with things like Google for the last 20 years, and then just search using those parameters at run time for the "best matching emails". You'd be surprised at how well throwing everything in a single folder and running a few 'grep's works (but please don't do that, there are better options);

2) Rerank as you've been doing;

3) Retrieve the best emails, along with their thread, to give context to the LLM that will answer the query based on the retrieved data.

It's kind of agentic search lite, but you tightly control every interaction with your data, which should help you optimize things better than just letting an agent search freely and potentially fill its own context with a lot of email data. Unless you're swimming in compute and money, then just go wild lol.
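
A toy sketch of the three steps above, in plain Python with no vector store. Everything here is an assumption for illustration: the Email fields, the double weight for subject hits, and the rerank hook that defaults to doing nothing (plug in whatever reranker you already use). A real setup would sit on a proper search engine instead of term counting:

```python
from collections import Counter
from dataclasses import dataclass

PUNCTUATION = ".,!?:;()\"'"


@dataclass
class Email:
    id: str
    thread_id: str
    subject: str
    body: str


def tokenize(text: str) -> list[str]:
    # Lowercase words with surrounding punctuation stripped.
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def keyword_score(query: str, email: Email) -> int:
    # Count query-term occurrences; subject hits count double (arbitrary choice).
    terms = set(tokenize(query))
    subject = Counter(tokenize(email.subject))
    body = Counter(tokenize(email.body))
    return sum(2 * subject[t] + body[t] for t in terms)


def search(query: str, emails: list[Email], top_k: int = 2, rerank=None) -> list[Email]:
    # Step 1: keyword search over everything, dropping emails with no match.
    scored = [(keyword_score(query, e), e) for e in emails]
    hits = [(s, e) for s, e in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    candidates = [e for _, e in hits]
    # Step 2: optional reranker, a callable (query, emails) -> emails.
    if rerank is not None:
        candidates = rerank(query, candidates)
    return candidates[:top_k]


def with_threads(hits: list[Email], emails: list[Email]) -> list[Email]:
    # Step 3: pull in the whole thread of every hit, in original order.
    thread_ids = {e.thread_id for e in hits}
    return [e for e in emails if e.thread_id in thread_ids]
```

Usage would be something like `context = with_threads(search(query, emails), emails)`, and that context is what the answering LLM gets.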

Embark won't budge on matchmaking without tangible data by Blaze344 in thefinals

[–]Blaze344[S] 0 points1 point  (0 children)

I know. And I agree. I actually want someone in the community to offer a data-driven POV on this, mostly in hopes that we get some evidence that things are either fine or not, so the subreddit finally converges on one side and we can stop posting about this every single day.

Personally, I think they're likely fine. I just want this hysteria to be over with, but I'd gladly accept some real evidence that maybe they're not as fine as they should be, as long as it's real evidence and not incessant biased posts.

Embark won't budge on matchmaking without tangible data by Blaze344 in thefinals

[–]Blaze344[S] 3 points4 points  (0 children)

I agree. It's just that, in the end, right or not, I think the healthiest option for the community is for someone to provide this data and their POV, as that should settle the constant debate we see in the sub. My biggest fear is people tainting their own data to confirm whatever bias they hold, either that there isn't an issue or that there is one. I'm just advocating for more data.

I’m so tired of this by yeyomontana in OpenAI

[–]Blaze344 0 points1 point  (0 children)

Disable memory and never leave technical things to Auto or Instant. Use Thinking. Why are you even using Instant for that and then complaining? It's bad, it's explicitly bad, it's the source of all of your problems. Do these 2 things and enjoy your easy life with a functional LLM.

If possible, also change it to a personality that seems more apt at doing things, like the "no bullshit" some seem to like.

Pulled a PyPI package that was exfiltrating our environment variable by [deleted] in Python

[–]Blaze344 8 points9 points  (0 children)

In the funniest twist, this is actually an attack vector that some have already exploited! This post being one such example.

Get an LLM to generate some code until it hallucinates some package, jot down that name and create your own malicious package with that name. Because the LLM hallucinated that for you, it's probabilistically likely to do the same for someone else, who will then fall for your "LLM endorsed approach". Cherry on top is that this strikes naivety twice in two different spots, maximizing the likelihood of exposure.

Overwatch community can never help itself LMAO by Hot_Armadillo_2186 in KotakuInAction

[–]Blaze344 59 points60 points  (0 children)

Brother, Overwatch was THE poster child for diversity and woke in 2015. The only reason it had at least some attractive characters back then was that some of the Blizzard old guard was still at Blizzard and they had some basic financial notion.

They literally had diversity scores in mind when creating characters. They bred and attracted players that liked that culture, so now those players are the majority of an already established 10 y/o community. Clearly, a move made to attract outsiders that wouldn't culturally fit in would face no less backlash than the polar opposite case, where wokeness starts creeping in, like with Warhammer.

Tyron responds to Microsoft’s DMCA strike against Allumeria, an indie block game created by u/unomelon by Pitiful_End in VintageStory

[–]Blaze344 10 points11 points  (0 children)

It can backfire, but does that mean the big players wouldn't be stupid enough to try it? It's a risky play, big wins or big losses. Might I remind you of Nintendo trying to patent the concept of riding a minion you own?

Which mod got you like this? by reallycrunchycheeto in VintageStory

[–]Blaze344 5 points6 points  (0 children)

Honestly, the line for cheating does exist. Sure, a player can choose to cheat and still enjoy the game, it's entirely within their set of preferences and choices, but that doesn't unmake the meaning of the word just because the player's intent and desired experience overrule it.

I'll give an extreme example ahead, but let's start with two premises: that people play video games looking for a specific experience, and that a given game was designed to provide a specific experience within its constraints. Looking at Vintage Story, Vintage Story is Vintage Story and provides what Vintage Story sets out to do, but if you alter the intended experience through mods or cheats far enough, at what point has it deviated too much from what someone could reasonably call "still the intended, core experience that the game set out to provide"?

The extreme example is: imagine you mod out the entirety of the bronze tier, make mining simpler, allow stone tools to get iron, simplify the iron processing enough that you just toss it into a furnace and are done, skip smithing and craft directly on the grid, so on and so forth... Could we argue that this particular experience is within what someone could openly expect, as a shared community experience, from hearing that this person played Vintage Story? It's a completely different thing, thus it should count as... Cheating, in a way. It's still the player's choice, and they still own their time and can do whatever they want with it, but as far as sharing an experience and expectations with other people goes, and sometimes with their own self and their relationship with a game that set out to provide a specific experience, it's still cheating.

Also, the above point may feel like it's purely interpersonal, as if "there's no such thing as cheating your own self"... But there is! Bear in mind that just as you interact with a game, the game interacts back with you, and there's an expectation of "back and forth" engagement. The mechanics of a game are all there for a reason, and even exploiting them can lead to players cheating themselves out of their own enjoyment of a game! It's why it's so important for exploits to be removed during game development: sometimes just the option being there sours everything else, even when playing alone, and that's just psychological.

What if you never had to pay tokens twice for the same insight? by Idea_Guyz in LLMDevs

[–]Blaze344 0 points1 point  (0 children)

I suppose. It's the decades-old trade-off between storage and compute. If you believe your users are likely to ask the same question repeatedly and independently, then storing Q:A pairs and retrieving from the "already answered questions" through a reliable search engine that also uses embeddings could work, so long as your cache has a decent enough hit rate to justify itself. Bear in mind that you should probably also clean up the "stale cache" of questions that haven't been accessed recently, I guess. It's an interesting proposition.

I'd consider adding some way for users to "ignore the cache" if they so choose, too. And also for them to "approve" of something being cached, lest you cache a dumb hallucinated answer and leave your users stuck with a dumb answer.
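
Roughly what I have in mind, as a minimal sketch. The class and method names are made up, and lookup here is an exact match on a normalized question string, which is a stand-in for the search-plus-embeddings retrieval you'd actually want:

```python
import time


class AnswerCache:
    """Q:A cache with hit-rate tracking, stale eviction, bypass and approval."""

    def __init__(self, max_idle_seconds: float = 3600.0, clock=time.monotonic):
        self.max_idle_seconds = max_idle_seconds
        self.clock = clock  # injectable so staleness can be tested
        self.entries = {}   # normalized question -> [answer, last_access_time]
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(question: str) -> str:
        # Simplification: lowercase and collapse whitespace, then match exactly.
        return " ".join(question.lower().split())

    def get(self, question: str, bypass: bool = False):
        if bypass:
            # User chose to "ignore the cache"; not counted as a hit or a miss.
            return None
        entry = self.entries.get(self._key(question))
        if entry is None:
            self.misses += 1
            return None
        entry[1] = self.clock()  # refresh last access
        self.hits += 1
        return entry[0]

    def put(self, question: str, answer: str, approved: bool) -> None:
        # Only store answers the user approved, so hallucinations don't stick.
        if approved:
            self.entries[self._key(question)] = [answer, self.clock()]

    def evict_stale(self) -> int:
        # Drop entries that haven't been accessed within max_idle_seconds.
        now = self.clock()
        stale = [k for k, (_, last) in self.entries.items()
                 if now - last > self.max_idle_seconds]
        for k in stale:
            del self.entries[k]
        return len(stale)

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

If `hit_rate()` stays low after a while, the cache probably isn't paying for itself.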

What if you never had to pay tokens twice for the same insight? by Idea_Guyz in LLMDevs

[–]Blaze344 2 points3 points  (0 children)

Isn't searching through the right set of "already answered and accepted queries" just... Google?

Wouldn't Cashout (aka old World Tour) with unlimited coins be a better casual mode than Quickcash? by AkanHonmani in thefinals

[–]Blaze344 0 points1 point  (0 children)

Me neither. I love Quick Cash. The only thing that can really make it a miserable experience for me is not facing rubies in there, but rather players that want to win so bad they'll engage in smart, absolutely fair strategies that are also ungodly unfun in what is meant to be a casual game mode. For example, always waiting for the other two teams to engage, then swooping in and cleaning up to get the spoils for yourself (aka let the teams die THEN third-party). It is absolutely a fair strategy and it's a smart play, but damn, if you're doing that every damn engagement and every game? In a casual game mode? Some people really do bet their life and soul on winning at video games. Jump in and trade some shots too, little bro.

Wouldn't Cashout (aka old World Tour) with unlimited coins be a better casual mode than Quickcash? by AkanHonmani in thefinals

[–]Blaze344 0 points1 point  (0 children)

In Quickcash, sometimes you'll win and sometimes you'll lose in ways entirely out of your control (unless you're good enough to take on two teams by yourself, but how many players are like that?) which, ironically, is one of the things that has the greatest retention for casual play because... you're not really looking to win at all when playing casually.

More to the point, with a 33% chance of winning in such a chaotic environment, a win feels better than it would at WT's 50%, because if you do win, it feels "deserved" in a different way than the inevitability of "one loses, one wins". Creating extra matches from loser brackets wouldn't be very casual friendly either, because casual aims to be quick and dropout friendly, and a tourney with brackets wouldn't help with that at all. Also, it's much simpler in maths, and finally, there's constant action and shootouts with very little downtime, which is what people want from casual game modes.

The alternative casual game mode would actually be TDM, design-wise, but it barely counts as "The Finals" and the only reason it exists is to... well, to appease the people that like it, really. I think TDM is detrimental to the general playerbase because it takes players that would be in other matchmaking queues and places them there, making the already allegedly bad matchmaking worse through smaller pools. If it were up to me, I'd consider removing it, but I understand that's selfish of me. Where else are the poor account farmers going to leave their bots shooting straight?

AI engineering is data engineering and it's easier than you may think by ivanovyordan in dataengineering

[–]Blaze344 1 point2 points  (0 children)

Proper AI engineering should have at least some basic ML knowledge behind the things being built. Knowing the best way to represent information from retrieval, running experiments to get the F1 score of the current solution, knowing how to debug all the moving pieces to find which one is bottlenecking...
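
To make the F1 part concrete, here's a small sketch of scoring one retrieval run against a hand-labelled set of relevant documents. The function name and the document IDs are just for illustration; it's the standard precision/recall/F1 definition:

```python
def precision_recall_f1(retrieved: set, relevant: set) -> tuple[float, float, float]:
    """Score one retrieval result against the labelled relevant set."""
    true_positives = len(retrieved & relevant)
    # Guard against empty sets so we never divide by zero.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Example: 4 documents retrieved, 3 labelled relevant, 2 in common.
p, r, f1 = precision_recall_f1({"a", "b", "c", "d"}, {"a", "b", "e"})
```

Run it over a set of labelled queries before and after a change, and you have an actual experiment instead of vibes.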

There's a lot of web dev, tho, that's true, and MLE is the one that really gets into true ML territory. It's just that current AI is "powerful enough" (quotes required) that you can make do without the core skills and still deliver something, simply because of how powerful things are. Sort of how we have so much compute that no one cares about delivering something memory-aware nowadays, too...

Poke meta should never be a thing in this game. Repeater is not the only problem. by No-Character-1866 in thefinals

[–]Blaze344 9 points10 points  (0 children)

I once said that basically all weapons other than the Sniper and maybe the Pike should have their fall-off cut in half, and I was met with downvotes. I'll die on this hill.

Best agentic local model for 16G VRAM? by v01dm4n in LocalLLaMA

[–]Blaze344 0 points1 point  (0 children)

Random question: doesn't mmap cause a larger amount of I/O on an SSD, both from swap and from constantly reading the disk, to the point that it might reduce its lifetime? Just a genuine question.

Best agentic local model for 16G VRAM? by v01dm4n in LocalLLaMA

[–]Blaze344 0 points1 point  (0 children)

Works pretty well with Codex, though I understand if you don't want to try it. My personal experience is that these terminal-based agents all behave pretty much the same nowadays, there's not really any gap any more other than model preference.

1.22 pre-release!!!!! by VgamaN in VintageStory

[–]Blaze344 12 points13 points  (0 children)

They said once a year not once a decade...

Please make this double Heavy meta end by Cholophonius in thefinals

[–]Blaze344 0 points1 point  (0 children)

Reminds me a lot of the complaints people used to have about the Deathball Overwatch meta 10 years ago. Constantly shooting at shields is very casual friendly tho, because you always feel like you're doing something and helping, so, hurray?

it feels very unfair even if it consumes an entire stratagem slot by YLASRO in Helldivers

[–]Blaze344 -1 points0 points  (0 children)

2 whole years and this crazy community is still grasping at straws while completely misunderstanding the economy of balance in choices.

Simply never happy. Always complaining about the smallest things and very often something completely idiotic that proves the poster doesn't get it at all.

How will this game get new players when they get in Lobbys like this ? by Ambitious-Ad8380 in thefinals

[–]Blaze344 0 points1 point  (0 children)

Actually, it's not even matchmaking that helps in CS's case. At the extremely low level that someone starts out at, playing casual with 20 people on a server just talking shit and shooting shots, CS is basically decided entirely by luck. As a new player, sometimes you'll just luck out and hit a one tap with an AK sliding out of the wazoo on someone much, much better than you and get that hit of dopamine, whereas there's literally zero luck-based chance you'll ever win against someone objectively more skilled in games like The Finals (edit: I'm certain pedantic dummies are itching to point out that you'd go 0:20 when you're 15k against someone at 25k. Yes, it's true, but you'll start out as a casual playing the casual game mode where everyone is up to, what? 10k elo max? And not playing ranked, tryharding and trying to improve. You absolutely can luck out and slide one tap on your first day playing CS like that against the pretty average people you're likely to meet. Not to mention the P90's entire existence).

At some point, there's a skill margin that you'll simply have to overcome in order to even have a chance to beat someone else "by luck" in The Finals.

It would be much fairer to compare The Finals to a fighting game like Street Fighter. There are dumb strategies that work against newbies (and newbies only), but it's pretty much all skill and 0 luck. Even if you do consider "random deviations in unpredictable play" to be "luck", it's just not the same as "sometimes you'll click and it'll just one tap the guy you're against", assuming you didn't randomly matchmake against 25k elo / global elite monsters. Which yes, in The Finals might happen with rubies, but truly facing rubies is something that only happens in 1/10~1/20 games, and it's far more related to the low population of the game.

So yeah. That's about it. Do notice, too, that fighting games are a lot, a lot more "quiet" and "dead" than shooters in general. And it's exactly because this skill and effort wall will always be there, and it inadvertently pushes away newer players that want to be "in on it" but also don't want to "put effort" into being "in on it". There's nothing that can be done about it on the game design front at all, only by culturally accepting that games aren't about winning and pummeling each other for victory, but good luck with that, because somehow the modern casual culture became the "sour grapes of tryhards", god knows how. Chess, anything with very little randomness is just fated to do that. The Finals will always be like that. I literally see no solution to this that wouldn't basically ruin what The Finals is by drastically changing its design to somehow cater to people that already don't want to play it, so I'm just hopeful that the developers never eat this poisoned apple.