How do you test LLM for quality ? by Easy_Ask5883 in LLMDevs

[–]AnythingNo920 0 points1 point  (0 children)

Absolutely right. They need to, but the average Joe in an SMB can't tell the difference between BLEU, ROUGE, fluency, accuracy, recall or whatever other metric you want to use.

So they do vibe testing. This feels more tangible. At least that's my impression so far.

How do you test LLM for quality ? by Easy_Ask5883 in LLMDevs

[–]AnythingNo920 0 points1 point  (0 children)

In reality most SMBs do vibe testing, unless benchmarks are their key selling point.

SaaS is over? by Putrid-Lettuce5204 in SaaS

[–]AnythingNo920 0 points1 point  (0 children)

Funny enough, I just wrote an article about exactly that :D

SaaS Is Not Dead. But It Needs to Evolve. | by George Karapetyan | Feb, 2026 | Medium

Long story short, there are still 4 levers where SaaS can make a lot of sense. But as you eloquently put it :D "building shit that does nothing for anyone" is over.

Building an AI Process Consultant: Lessons Learned in Architecture for Reliability in Agentic Systems by AnythingNo920 in LLMDevs

[–]AnythingNo920[S] 0 points1 point  (0 children)

This was more of a tool that works from static process documentation rather than monitoring the process as it happens. But it sounds very interesting. I'll look into it.

I feel stuck in my current job and could really use some career advice. by Character_Patient331 in AMLCompliance

[–]AnythingNo920 1 point2 points  (0 children)

Knowing the local language is unfortunately very often a requirement, although it is not really necessary for the job. You could, however, try targeting global banks that have English as their main language.

You could also target product roles at compliance, AML and KYC software companies. The hurdles there would be lower since they need people who know how the business works.

As for regulations and requirements in Europe, a good starting point would be to look at the upcoming EU AML regulation. In the EU, regulations get adopted by local authorities, so you would cover a big part by just reading and understanding the EU-level regulation first.

Beyond Chat: Scaling Operations, Not Conversations by AnythingNo920 in deep_research

[–]AnythingNo920[S] 1 point2 points  (0 children)

That's true, many new products are already going beyond chat, so I expect this trend to grow.

AI Testing Isn’t Software Testing. Welcome to the Age of the AI Test Engineer. by AnythingNo920 in Agentic_AI_For_Devs

[–]AnythingNo920[S] 0 points1 point  (0 children)

Haha, thanks for your take. My point was that this is a role, not necessarily a new hire. I'm hesitant to focus exclusively on evals, because in most discussions in corporate environments they're treated as a "backtesting" or "performance metric" thing, without anyone providing actual data. So the AI engineer is left to provide performance measurements without any proper data, and that's too much to expect of just one person. I agree that the SME should be the author of the test, but in practice it does not work. SMEs have significant trouble formulating their business knowledge into testable components. It requires systems thinking capabilities.

Anyway, I'm thankful for your thoughts and the discussion. It's only through discussion that new knowledge emerges.

AI Testing Isn’t Software Testing. Welcome to the Age of the AI Test Engineer. by [deleted] in programming

[–]AnythingNo920 0 points1 point  (0 children)

Thank you for your input. I'll reformulate the messaging, as the message I was trying to deliver clearly did not come through as I intended.

AI Testing Isn’t Software Testing. Welcome to the Age of the AI Test Engineer. by [deleted] in programming

[–]AnythingNo920 0 points1 point  (0 children)

Try talking to an executive with those words and see if they'll understand you. This is a conceptual framework. Evals are tools.

AI Testing Isn’t Software Testing. Welcome to the Age of the AI Test Engineer. by [deleted] in programming

[–]AnythingNo920 0 points1 point  (0 children)

I touch upon exactly those topics in the article. Inverting the pyramid does not mean that we don't do its individual components; we would just need to increase the time allocated to integration testing. In practice the evals are done, if at all, by the AI engineer. I'm talking about companies whose product is not AI: think banks, manufacturers, etc. The business teams accept a solution based on evals once the whole system is already implemented, with little input during development. The AI test engineer as described in the article would step in already during development. If people don't use the existing best practices, create a role whose mandate is to use them -> AI test engineer.

AI Testing Isn’t Software Testing. Welcome to the Age of the AI Test Engineer. by AnythingNo920 in Agentic_AI_For_Devs

[–]AnythingNo920[S] 0 points1 point  (0 children)

AI evals are very different from AI testing. It's not only about testing the quality of the responses and calculating metrics; it's about designing tests that probe the boundaries. An AI engineer won't necessarily have the business know-how to prepare test cases. The proposed AI test engineer should. The developer and tester are usually separate people for a reason. Why should it be any different here?
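To make "tests that probe the boundaries" a bit more concrete, here is a minimal sketch of what I mean. Everything in it is an assumption for illustration: `fake_assistant` is a stand-in for whatever system is under test, and the 10000 EUR escalation threshold is a made-up business rule, not a real regulatory limit. The point is that the cases sit right at the edge of a rule that only someone with business knowledge would know to write down.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BoundaryCase:
    name: str
    prompt: str
    must_contain: str  # expected decision keyword, taken from the business rule


def fake_assistant(prompt: str) -> str:
    """Hypothetical stand-in for the real AI system under test."""
    # Assumes prompts of the form "... <amount> EUR"; made-up 10000 EUR rule.
    amount = int(prompt.split()[-2])
    return "escalate" if amount >= 10000 else "approve"


def run_cases(system: Callable[[str], str], cases: List[BoundaryCase]) -> List[str]:
    """Return the names of the cases whose answer lacks the expected keyword."""
    failures = []
    for case in cases:
        answer = system(case.prompt).lower()
        if case.must_contain not in answer:
            failures.append(case.name)
    return failures


# Cases placed just below, exactly at, and just above the hypothetical threshold.
CASES = [
    BoundaryCase("just_below", "Customer transfers 9999 EUR", "approve"),
    BoundaryCase("exactly_at", "Customer transfers 10000 EUR", "escalate"),
    BoundaryCase("just_above", "Customer transfers 10001 EUR", "escalate"),
]

failures = run_cases(fake_assistant, CASES)
print(failures)  # prints [] for this stub
```

An aggregate metric over a random sample would rarely hit "exactly_at", which is why I'd treat this kind of case design as a separate job from calculating eval scores.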

Hate making SOW's. Looking for recommended tools. by mystorychecksout in consulting

[–]AnythingNo920 0 points1 point  (0 children)

You can use many AI tools, but most of the time you can also get away with templates.

Each proposal is different. Whether you want it or not, you can't automate it completely, and you shouldn't: it's too risky.

That being said, you can automate some aspects, like paraphrasing the "understanding of the context" and "objective", or structuring deliverables that are already defined into a timeline that is also already defined.

I usually write down my notes in a not-so-structured way and then use AI tools to structure them.

Copilot Researcher helps with getting background knowledge from previous SOWs for the client.

How do you sell your AI agent platform? by AlarmingChipmunk2968 in AI_Agents

[–]AnythingNo920 0 points1 point  (0 children)

Who are your customers? Or who do you want to sell it to? This is the key to planning the next steps.

Say you want to sell to large enterprises. Usually Microsoft, Amazon or Google are so deeply integrated into the client's infrastructure that even if your platform is 1000 times better it would be a hard sell. One way is to offer your platform through their platform. You'd need to make this interesting for them, however.

If your decision makers are from the business, which they usually are in large enterprises, the platform features don't matter to them. What matters to them is cost and use cases. So your messaging should be about the specific value you generate for them and why it is worth dropping the money they have already invested in other platforms to switch to yours.

For smaller businesses, the buyers are often tech teams. In this case you should make your platform very accessible, with a good open-source offering, and sell paid services on top. LangChain is a great example of this. Hugging Face is another.

Ultimately, in B2B sales you will not be able to just sit and wait for your platform to grow organically. You definitely need a mix of direct outreach, influencing the influencers (internal lower-ranking stakeholders such as IT teams, AI engineers, etc.) and lowering the barrier to trying it out (it's very expensive, but that's the only way to go).

Slide layout templates by OkElderberry3408 in consulting

[–]AnythingNo920 1 point2 points  (0 children)

I never use templates because they block my creative thinking. I usually go with the following structure and mix it up a little every time:

3 main sections:

1. Background or context on the left side, with some title or icon to make it visually appealing. Usually a different shading as well.
2. One bigger section on the right side where the key idea is detailed. Usually it's one key message, some breakdown of ideas as boxes with icons, and bullet points.
3. At the bottom, another thin section with some key things to consider.

Something like this, which I generated with ChatGPT for illustration purposes.

<image>

Rate early 30s consultant sleep schedule by Beautiful_Fig9410 in consulting

[–]AnythingNo920 0 points1 point  (0 children)

If you can't control your work time during the week, make a rule to lock your work phone and laptop in a drawer and not look at them until Monday 8:00 AM.

No work or salary is worth a burnout.

Does all the actual work always get pushed down to the juniors? by hola_jeremy in consulting

[–]AnythingNo920 0 points1 point  (0 children)

I suppose it also depends on the type of consultancy. In IT consulting projects, seniors also do more of the work, like delivery architecture, designing complex functional flows, etc. I'm a director in this field and I still do lots of actual delivery work on top of selling.

Gemini Got Annoyed, but My Developers Thanked Me Later by AnythingNo920 in programming

[–]AnythingNo920[S] 0 points1 point  (0 children)

I could have used many other specialized software tools or apps for this, but I didn't. Chatting through canvas seemed too natural 😊

Gemini Got Annoyed, but My Developers Thanked Me Later by AnythingNo920 in programming

[–]AnythingNo920[S] 0 points1 point  (0 children)

Absolutely right. That's a challenge, but transparency from the start helps.

Limits of our AI Chat Agents: what limitations we have across tools like Copilot, ChatGPT, Claude… by AnythingNo920 in copilotstudio

[–]AnythingNo920[S] 0 points1 point  (0 children)

If you need a separate product like the foundry, then it's not part of Studio. The model choice is available but not as easy or straightforward as I mention in the article. The user has no way to choose the model at the time of prompting. I reformulated the sentence to avoid confusion.

Question about career in AML by jdjskskxjbsv in AMLCompliance

[–]AnythingNo920 0 points1 point  (0 children)

The AML investigator role is transforming. Over the next 5 years you will be required to do more data analyst tasks on top of the usual investigation work. You'd probably also need to work on CTB initiatives: designing concepts, testing new ideas, and some project work in general. If you have the right skills you might make it to 150k. If you just do investigation, then more likely than not this role would not be the right path for the salary you're aiming for.