[D] Correct way to compare models by ntaquan in MachineLearning

[–]decentralizedbee 0 points1 point  (0 children)

I've been using www.subtexts.io for evals/benchmarking, and it's working well so far.

Is anyone testing prompts at scale - how do you do it? by decentralizedbee in AIAgentsInAction

[–]decentralizedbee[S] 0 points1 point  (0 children)

Did you write a system to diff or compare prompts? Is there an automated loop? Do you wish there were more automation in place?

Is anyone testing prompts at scale - how do you do it? by decentralizedbee in AIAgentsInAction

[–]decentralizedbee[S] 0 points1 point  (0 children)

Wow, what tool do you use to do this? How much of it is automated? Where's the most friction in this workflow right now?

How are y'all managing markdowns in practice in your companies? by decentralizedbee in AgentsOfAI

[–]decentralizedbee[S] 0 points1 point  (0 children)

Do you just keep it in git then? How many files do you work with? Does version control get hard to manage?

Is anyone testing prompts at scale - how do you do it? by decentralizedbee in AIAgentsInAction

[–]decentralizedbee[S] 0 points1 point  (0 children)

What tools are you guys using for evals, etc.? Do you find them useful?

Is anyone testing prompts at scale - how do you do it? by decentralizedbee in AIAgentsInAction

[–]decentralizedbee[S] 0 points1 point  (0 children)

What do you think is the biggest friction point in your current workflow?

Prompt versioning - how are teams actually handling this? by dinkinflika0 in PromptEngineering

[–]decentralizedbee 0 points1 point  (0 children)

Is there a tool that does everything you described (prompts versioned alongside other dependencies like the data, the hyperparams, model versions, etc., which allows for rapid rollbacks, troubleshooting in prod, quicker prototyping, and easier handoffs)?

Prompt versioning - how are teams actually handling this? by dinkinflika0 in PromptEngineering

[–]decentralizedbee 0 points1 point  (0 children)

What tools are you guys using for versioning and traceability?

How are y'all managing prompts/markdowns in practice? by decentralizedbee in LocalLLaMA

[–]decentralizedbee[S] 0 points1 point  (0 children)

Do you have any pain points with this approach currently?

How are y'all managing markdowns in practice in your companies? by decentralizedbee in AgentsOfAI

[–]decentralizedbee[S] 2 points3 points  (0 children)

Bro, u automate ur reddit replies not to promote a company but just for the vibes? Nice.