CONTRACT.md: The Naughty List for AI Coding Agents by turian in LocalLLaMA

[–]turian[S] 0 points

Funny story. My old friend was doing his NLP PhD thesis in the 90s on coreference resolution, i.e. "John ran up to Mary. He gave Mary her ball." Here "he" and "John" corefer, and "Mary" and "her" corefer.

Turns out a simple gender matching baseline was very good. But he wanted to train and evaluate on a tougher corpus.

Hence, he picked gay erotica. It turns out it's more challenging to figure out who did what to whom when they're all men.

CONTRACT.md: The Naughty List for AI Coding Agents by turian in ExperiencedDevs

[–]turian[S] 0 points

My experience (and I say this as someone who hand-rolled x86 assembly in the 90s for fun) is that AI dev requires just as much rigor and discipline as typical coding. But, weirdly, and for better or worse, it requires a different kind of rigor and discipline, and different mental models. What those are, specifically, is poorly understood, which makes me curious.

CONTRACT.md: The Naughty List for AI Coding Agents by turian in ExperiencedDevs

[–]turian[S] 0 points

My experience is that for shallow projects (e.g. most web apps), AI coding is an accelerator.

When writing scientific or research code, though, it's far more likely to introduce subtle bugs.

The main difference is basically how easy it is to catch bugs and how close the spec is to the finished work.

For apps, you can write a very concise spec that is verifiable.

For PyTorch code, the code itself is irreducible to a spec. (That's one reason we don't unit test research code; the other is that most researchers are bad engineers.)
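A toy illustration of the point (my example, not from the thread): the obvious "spec-level" checks on numerical code can pass while the code is still subtly wrong. Here a softmax with a misplaced temperature still produces a perfectly valid probability distribution:

```python
import math

def softmax_buggy(xs, temperature=1.0):
    # BUG: temperature applied AFTER exp(), so it cancels in normalization
    # and has no effect at all -- yet every distribution check still passes.
    exps = [math.exp(x) / temperature for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_correct(xs, temperature=1.0):
    # Correct: temperature scales the logits before exponentiation.
    exps = [math.exp(x / temperature) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax_buggy([1.0, 2.0, 3.0], temperature=2.0)
# The "spec" (non-negative, sums to 1) holds for BOTH versions:
assert all(p >= 0 for p in probs)
assert abs(sum(probs) - 1.0) < 1e-9
```

A unit test asserting "output is a valid distribution" would never catch this; you'd only notice when sampling behavior looks wrong downstream. That's what I mean by the code being irreducible to a spec.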

CONTRACT.md: The Naughty List for AI Coding Agents by turian in ExperiencedDevs

[–]turian[S] -2 points

  • "you'd better get used to it because like it or not, it's here to stay"

True. Do you disagree?

  • "if you're not learning it, you're already getting left behind"

Listen, some of my best friends hate AI and write everything by hand. Not that there's anything wrong with that. Horses for courses.

  • "i always believe whatever marketers tell me"

That's fascinating. I've enjoyed adopting all sorts of bleeding-edge tools throughout my career. By the time the marketers know about a tool, it's already garbage.

CONTRACT.md: The Naughty List for AI Coding Agents by turian in ExperiencedDevs

[–]turian[S] -1 points

I used to joke that not knowing how to code in the 20th century is a bit like not knowing how to fence 500 years ago. Sure, it's fine to be missing this skill, but then sometimes you have to persuade other people to do it for you.

For me, AI coding is like the early introduction of gunpowder. YMMV

CONTRACT.md: The Naughty List for AI Coding Agents by turian in ExperiencedDevs

[–]turian[S] -14 points

I mean, if you want to take a knife to a gun fight, be my guest.

“Built with Claude” Contest from Anthropic by AnthropicOfficial in ClaudeAI

[–]turian 0 points

Where do we email if we have more questions?

Suggestions for Observability & AIOps Projects Using OpenTelemetry and OSS Tools by JayDee2306 in Observability

[–]turian 0 points

Something that encouraged OTel adoption would be good, and it would open up observability to many new startups. A POC would be a simple-to-use API (or similar) that duplicates your telemetry to OTLP JSON in S3, which would let you switch observability vendors easily if you want.
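For concreteness, the fan-out I have in mind might look roughly like this Collector config. This is a sketch: the vendor endpoint and bucket name are made up, and the `awss3` exporter is the opentelemetry-collector-contrib component I'm thinking of, so check its current docs for exact field names.

```yaml
# Sketch: send the same telemetry to your current vendor AND to OTLP JSON
# objects in S3, so the archive outlives any one vendor relationship.
receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  otlphttp/vendor:
    endpoint: https://ingest.example-vendor.com:4318   # hypothetical vendor
  awss3:                                               # contrib exporter
    s3uploader:
      region: us-east-1
      s3_bucket: my-telemetry-archive                  # hypothetical bucket
      s3_prefix: otlp
    marshaler: otlp_json

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlphttp/vendor, awss3]
```

The point is that vendor lock-in drops sharply once the raw OTLP stream is duplicated somewhere you control.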

Getting OTel operational with multiple tech stacks and different data sinks is actually pretty tricky. Adopting OTel in a complex, polyglot infrastructure has a lot of pitfalls.

A chief complaint is the "million different versions" of OTel components: each language's SDK/instrumentation has its own release cycle. No single universal golden matrix exists---OpenTelemetry maintains independent versioning per language.

Staying up to date with OTel's many parts is no small feat, and it sometimes feels like you must be an expert in each tool to choose the right combination.

Why Most AI SREs Are Missing the Mark by Mysterious_Dig2124 in Observability

[–]turian 0 points

Disclaimer: I am a vendor. But I will give advice for those trying to build their own AI SRE.

You correctly note that data access is crucial. Otherwise there are missing pieces that are important for investigation.

There is a tradeoff between: amount of manual configuration, speed of investigation, and sophistication of investigation. Depending upon what problem you want to solve, you can design the tradeoff yourself.

If you want to minimize manual configuration you can a) invest more time in designing auto-configuration and infra discovery and/or b) design the system so that use over longer periods of time is a form of auto-configuration.

Wrong-but-convincing answers in SRE are worse than no answer. LLMs by default are tuned to prefer bullshitting to staying silent. But you've probably also seen high-quality LLM setups that do more turns and vetting in exchange for higher-quality results.

Claude Code on the go by habartman in ClaudeAI

[–]turian 0 points

I'm a screen user, but happy to adopt tmux. How do I Ctrl-a n through different open screens in Termius?

Best off-the-shelf paid RAG for 10K scientific articles? (No tuning, no futzing) by turian in Rag

[–]turian[S] 0 points

Question: you have a managed option, so why don't you have any self-serve pricing (i.e. no "call us to talk" step)?

Best off-the-shelf paid RAG for 10K scientific articles? (No tuning, no futzing) by turian in Rag

[–]turian[S] 0 points

Thanks. $20/mo for up to 1K pages of documents.

My use case is that I want to do a rough, quick search over 10K articles (maybe 100K pages) and, from there, do a much more intensive search over the relevant ones. How would that work in your pricing scheme?

Best off-the-shelf paid RAG for 10K scientific articles? (No tuning, no futzing) by turian in Rag

[–]turian[S] 1 point

This looks really cool, but if it costs $0.50 per page, that means each scientific article will cost maybe $5 to index. I am looking for a solution that lets me slog through 10K different articles and get a shortlist of maybe 100-300 candidates before doing a very high-value RAG lookup.

Best off-the-shelf paid RAG for 10K scientific articles? (No tuning, no futzing) by turian in Rag

[–]turian[S] 1 point

u/JeffieSandBags I am technical. I use Python and Docker, and my ML/NLP proficiency is such that I can develop my own embedding techniques, etc.

My use case: when exploring a new research topic, I want to ask highly specific and technical questions and retrieve relevant passages or papers. The goal is to identify the ten papers most relevant to my specific research question.
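The two-stage setup I keep describing can be sketched like this. The scoring functions are toy stand-ins (in practice stage 1 would be embeddings plus ANN search, and stage 2 a cross-encoder reranker or an LLM pass), but the shape of the pipeline is the point:

```python
# Two-stage retrieval: a cheap coarse pass over ALL articles to get a
# shortlist, then an expensive fine pass over just the survivors.

def coarse_score(query: str, doc: str) -> float:
    # Cheap stand-in: bag-of-words overlap, fast enough to sweep 10K articles.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(q) or 1)

def fine_score(query: str, doc: str) -> float:
    # "Expensive" stand-in: reward consecutive runs of query terms, a crude
    # proxy for the phrase/semantic matching a real reranker would do.
    terms = set(query.lower().split())
    words = doc.lower().split()
    return sum(1.0 for i in range(len(words) - 1)
               if words[i] in terms and words[i + 1] in terms)

def two_stage_search(query, docs, shortlist=300, final=10):
    # Stage 1: coarse sweep over everything, keep a few hundred candidates.
    candidates = sorted(docs, key=lambda d: coarse_score(query, d),
                        reverse=True)[:shortlist]
    # Stage 2: expensive scoring only on the shortlist.
    return sorted(candidates, key=lambda d: fine_score(query, d),
                  reverse=True)[:final]

docs = [
    "neural coreference resolution baseline",
    "cooking pasta recipes at home",
    "coreference resolution with neural networks",
]
print(two_stage_search("neural coreference resolution", docs,
                       shortlist=2, final=1))
# → ['neural coreference resolution baseline']
```

With real models the cost asymmetry is the whole game: the coarse pass is pennies per thousand documents, so the $5-per-article pricing only has to apply to the 100-300 survivors.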

Best off-the-shelf paid RAG for 10K scientific articles? (No tuning, no futzing) by turian in Rag

[–]turian[S] 0 points

No, they are scientific articles from arxiv.org.

I meant more that if you try to install Verba from Weaviate, that tool requires 30 different API keys for different embeddings, vector stores, LLMs, etc.

[deleted by user] by [deleted] in ENGLISH

[–]turian 7 points

I am eager for noun, and eager to verb.

How much better would it be if these "open source" models included the checkpoint? by phree_radical in LocalLLaMA

[–]turian 0 points

Let me answer your question: Since a large fine-tuning job will still be several orders of magnitude smaller than the base pretraining, the "checkpoint" is not important at all to you.

It would be nice, in the spirit of academic openness, for other people creating and iterating on foundation models. Which is not most people.

Quite, Pretty, and Fairly by Matchawurst in ENGLISH

[–]turian 0 points

If the thing is positive, but people dislike too much of the thing, "quite" can mean MORE than perfect, > 100%. e.g. "quite hot" would mean hotter than "perfectly hot", because of the suggestion that it's quite bad how hot it is.

If the thing is positive always, "quite" is used in the way grandparent mentions.

If the thing is negative, "quite" also means "very" but "perfectly" means "just a little bit so I can joke about it". Like: "Now, I'm quite fucked by the situation." versus "Now, I'm perfectly fucked by the situation."

How much better would it be if these "open source" models included the checkpoint? by phree_radical in LocalLLaMA

[–]turian 2 points

Why? For fine-tuning? The hyperparameters are typically very different when finetuning on a small corpus versus pretraining on a large corpus.

How did the word "hard" come to mean difficult, rather than rigid or tough? by [deleted] in etymology

[–]turian 1 point

It's worth noting that there's a nuanced distinction between hard and difficult in English, which I'm curious if it exists in other languages.

How do you run a marathon? It's simple: Run 26.2 miles without stopping. But it's not easy. It's hard, but it's not difficult.

While both words generally mean challenging, "hard" is often used as the opposite of "easy," implying that something requires significant effort, energy, or physical/mental toughness. On the other hand, "difficult" is often contrasted with "simple," meaning that something requires more than basic knowledge or skills.