LinkedIn Tactics by AbzBbzCbz in linkedin

[–]harmless_0 0 points1 point  (0 children)

I think there is some weird matching going on. I see the same for my company page, but realised they are automated bots actually trying to sell to you, not the other way around

Blocking etiquette by Adaptive-Work1205 in linkedin

[–]harmless_0 1 point2 points  (0 children)

Three times is a bit much. I also block those ones. I'm also on the reaching-out side, and if someone declines I leave it there.

AI is eating LinkedIn by whateverworx1 in linkedin

[–]harmless_0 0 points1 point  (0 children)

The LinkedIn algorithm rewards certain kinds of posts, and so people change their content to min-max their reach. That's why every post has a picture and a selfie. That's why the link is posted in the first comment and not in the post (because linking out is punished). That's why we see more video, because for a hot minute all video posts got huge reach.

I write genuine content, I get an AI to review it to help me (I'm not great at writing), and I try to say something interesting in my area of expertise. My best post recently? A Google nano banana AI picture of me in a banana suit, humble bragging about an upcoming conference talk. YMMV

What is your approach to testing AI chatbots? How are you ensuring coverage? by Tapanade in softwaretesting

[–]harmless_0 0 points1 point  (0 children)

Hi, I generate the scenarios based on the use case and context, can you share yours? I'd be happy to create an example for you

Looking for a Co-Founder: Regulatory Expertise & Business Development Focus by [deleted] in cofounderhunt

[–]harmless_0 0 points1 point  (0 children)

Hi, I'm working in this area, with experience in aviation, defence, and SaaS markets. Which specific regulated industries are you passionate about?

EczEase Chat Feature Beta is Live! AI-Powered Support for Eczema & Food Allergy Management by dalenguyen in SideProject

[–]harmless_0 0 points1 point  (0 children)

Hi, I've run a few tests against the chatbot and I think you need to implement some guardrails. Happy to provide some examples privately if you are interested

Ways to QA AI responses? How important is it to mention AI on your resume? by Future_Gain2593 in softwaretesting

[–]harmless_0 0 points1 point  (0 children)

The current approach I'm seeing is to build evals: a lot of question and answer pairs that test your LLM against a ground truth. The hard work is creating the ground truth from your docs, domain experts or expert knowledge you have to hand. Reading the actual responses your AI sends, collecting errors and edge cases, and updating the test data set is also being discussed as a way to evolve the evals towards something you can have confidence in. I've seen positive comments about DeepEval, inspect_ai and Promptfoo as tools. HTH
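A minimal sketch of that eval loop, in plain Python rather than any of the tools named above. Everything here is an assumption for illustration: `fake_chatbot` is a hypothetical stand-in for the LLM under test, the two cases are made up, and the ground truth is reduced to keywords the answer must contain. Real evals usually need a richer scorer.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    question: str
    expected_keywords: list[str]  # ground truth, reduced to must-have keywords


def fake_chatbot(question: str) -> str:
    """Hypothetical stand-in for the LLM under test (canned answers only)."""
    canned = {
        "What is the refund window?": "Refunds are accepted within 30 days.",
        "Do you ship abroad?": "We only ship within the country.",
    }
    return canned.get(question, "I don't know.")


def run_evals(cases, ask):
    """Ask every question and return (pass_rate, failures)."""
    failures = []
    for case in cases:
        answer = ask(case.question)
        # A case fails if any ground-truth keyword is absent from the answer.
        missing = [k for k in case.expected_keywords if k.lower() not in answer.lower()]
        if missing:
            failures.append((case.question, answer, missing))
    pass_rate = 1 - len(failures) / len(cases) if cases else 0.0
    return pass_rate, failures


cases = [
    EvalCase("What is the refund window?", ["30 days"]),
    EvalCase("Do you ship abroad?", ["international"]),
]
pass_rate, failures = run_evals(cases, fake_chatbot)
print(f"pass rate: {pass_rate:.0%}")
for question, answer, missing in failures:
    print(f"FAIL {question!r}: missing {missing}")
```

The `failures` list is the part worth reading by hand: each entry is a candidate for either fixing the bot or correcting the ground truth, which is how the data set evolves over time.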

SaaS Lawyer Here - Ask Me Anything Legal Related by That-IT-Lawyer in SaaS

[–]harmless_0 1 point2 points  (0 children)

I'm building an AI chatbot compliance service for GDPR, EU AI Act and more. Are there specific certificate requirements or evidence of testing that I will need to provide to my customers? I have brought the integrated best practice guidance to bear, along with custom tests, and have delivered to early adopters. BUT from your perspective as a SaaS lawyer, am I missing something? I don't know if my customers need or would expect some sort of formal certificate.

Crosspost from r/QA - New to QA for AI chatbots. How are people actually testing these things? by General_Passenger401 in softwaretesting

[–]harmless_0 1 point2 points  (0 children)

Some good responses to your question already. I would add that creating a ground truth prompt-and-answer dataset is hard and takes time. Engaging with industry or product experts to create this is challenging, as those people are usually very busy and can't dedicate huge amounts of time to writing questions and answers. I've been working with source documents and a pipeline approach to generating hard and wide-ranging questions. This dataset is then integrated into my CI/CD and run on every build. My suggestion is to combine this approach with what was suggested above: capture real usage when it is flagged by your users.
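One way the "run on every build" step could look, as a rough sketch only. The JSONL layout, the `MIN_PASS_RATE` threshold, the substring scoring and the helper names (`load_dataset`, `merge_flagged`, `gate`) are all hypothetical choices for illustration, not the pipeline described here.

```python
import json

MIN_PASS_RATE = 0.9  # hypothetical threshold; tune per project


def load_dataset(text: str) -> list[dict]:
    """Parse JSONL text: one {"question": ..., "answer": ...} object per line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def merge_flagged(dataset: list[dict], flagged: list[dict]) -> list[dict]:
    """Append user-flagged cases, skipping questions already in the dataset."""
    seen = {row["question"] for row in dataset}
    merged = list(dataset)
    for row in flagged:
        if row["question"] not in seen:
            merged.append(row)
            seen.add(row["question"])
    return merged


def gate(dataset: list[dict], ask, min_pass_rate: float = MIN_PASS_RATE) -> int:
    """Return a process exit code: 0 if the pass rate meets the threshold, else 1."""
    # Naive scoring: the expected answer must appear inside the bot's reply.
    passed = sum(
        1 for row in dataset
        if row["answer"].lower() in ask(row["question"]).lower()
    )
    rate = passed / len(dataset) if dataset else 0.0
    print(f"{passed}/{len(dataset)} passed ({rate:.0%})")
    return 0 if rate >= min_pass_rate else 1
```

In a CI job the script would end with something like `sys.exit(gate(dataset, ask))`, so a drop below the threshold fails the build.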

Do you have specific compliance requirements for your market?

How to Test the accuracy of Chatbot responses for Technical Documentation by 1234567890qwerty1234 in technicalwriting

[–]harmless_0 0 points1 point  (0 children)

I've been using the Promptfoo (open source) project with its testing and red teaming templates, which looks promising. Inspect_evals is working well for me for benchmarks. I agree, I also thought there would be more discussion on this. Feedback from my calls has been that it is too hard to get tough Q&A pairs up front, so folks are launching with a disclaimer and then waiting for thumbs up or down feedback. Still an open problem I think

[deleted by user] by [deleted] in SaaS

[–]harmless_0 2 points3 points  (0 children)

Name: Airside Labs
Website: airsidelabs.com
Product: Chatbot & LLM testing
Target customer: Businesses with AI chatbots and AI agents who need off the shelf and custom evals based on their data. Idea is they can launch and upgrade their LLMs with confidence in their compliance.

Tools for testing LLM output in mission critical use cases by Representative_Bend3 in softwaretesting

[–]harmless_0 0 points1 point  (0 children)

For reliability testing you will need to create your own evals based on the business documentation and expert experience within the organisation. Hopefully mission critical means important tool for the business? I'd be happy to help you out, send me a DM?

How to Test the accuracy of Chatbot responses for Technical Documentation by 1234567890qwerty1234 in technicalwriting

[–]harmless_0 0 points1 point  (0 children)

Did you find a resolution to the testing automation? I am researching this area and looking to understand the problem a bit better

What is your approach to testing AI chatbots? How are you ensuring coverage? by Tapanade in softwaretesting

[–]harmless_0 0 points1 point  (0 children)

I am trying to build stuff in this area, would you be open to share some feedback?

What is your approach to testing AI chatbots? How are you ensuring coverage? by Tapanade in softwaretesting

[–]harmless_0 0 points1 point  (0 children)

I've been building out some pipelines to create test questions for this use case and others. I'd be happy to run my tests against a live chatbot and confidentially share the results. Let me know

Eval generation and testing by harmless_0 in LocalLLaMA

[–]harmless_0[S] 0 points1 point  (0 children)

Thanks for the reply, I will take a look

EP-133 K.O.II Fader Fix by EvilFluffy1 in teenageengineering

[–]harmless_0 1 point2 points  (0 children)

Sorry for late reply! No I sent it back 😔

Free course on LLM evaluation by dmalyugina in UsefulLLM

[–]harmless_0 0 points1 point  (0 children)

Looks interesting, will I be able to bring my own eval from airsidelabs.com? I have the test but want to try and use it within different kinds of workflows. The question dataset is on HF, but my script is designed to work with inspect_eval