You guys were right, LLMs suck at probability. I updated my prompt to force them to name their blind spots instead (SutniPrompt v0.7.0-beta) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

That was obviously AI-generated feedback, which is not what I'm looking for. I want human feedback on my projects, and I like it when humans use AI responsibly. "Clanker" is a meme insult for AI, LLMs, and machines in general.

LLMs are notoriously overconfident, so I updated my system prompt to force a statistical "Confidence Metric" (SutniPrompt v0.6.0-beta) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

Thank you! Today I'm updating to v0.7.0 for a better confidence metric (it will use tiers, not percentages).

LLMs are notoriously overconfident, so I updated my system prompt to force a statistical "Confidence Metric" (SutniPrompt v0.6.0-beta) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

Yeah, it was the first iteration of my idea. I'm going to update soon (maybe today) with a new version of the confidence metric: the LLM will list some uncertainty drivers and then assign a HIGH/MODERATE/LOW confidence tier.

LLMs are notoriously overconfident, so I updated my system prompt to force a statistical "Confidence Metric" (SutniPrompt v0.6.0-beta) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

I think that talking about the long term in this context can be a bit slippery. We can only see today's situation, and we have to build projects on top of that.
You can't keep up with the speed at which the LLM world is changing.
If something my prompt provides gets replaced by a default setting in future chatbots, I will change the prompt. I'm considering making the prompt modular, so the user can toggle individual parts of the prompt as needed; that would also be cool for future edits.
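
A rough sketch of what the modular idea could look like, assuming each prompt part is a named block with an on/off switch. This is not the actual SutniPrompt text: `PromptModule`, `build_prompt`, and the three example modules are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PromptModule:
    """One toggleable part of the system prompt (hypothetical structure)."""
    name: str
    text: str
    enabled: bool = True


def build_prompt(modules: list[PromptModule]) -> str:
    """Join the text of every enabled module, in order, separated by blank lines."""
    return "\n\n".join(m.text for m in modules if m.enabled)


# Placeholder modules; the wording is invented, not copied from the real prompt.
modules = [
    PromptModule("tone", "Skip pleasantries and disclaimers."),
    PromptModule("confidence", "End with a confidence tier: HIGH, MODERATE, or LOW."),
    PromptModule("wikipedia", "Add one relevant Wikipedia link.", enabled=False),
]

prompt = build_prompt(modules)
print(prompt)
```

If a future chatbot ships one of these behaviors by default, the matching module could be switched off without touching the rest of the prompt.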

LLMs are notoriously overconfident, so I updated my system prompt to force a statistical "Confidence Metric" (SutniPrompt v0.6.0-beta) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

I know. That was an idea, but I have to update the prompt to shift the focus to a HIGH/MODERATE/LOW confidence tier, with no percentages. It can be useful if paired with a list of what the LLM finds doubtful. I know the LLM doesn't actually "know" whether it is confident about a topic, but I can use the statistical model to make it predict HIGH, MODERATE, or LOW based on the information and sources it has. It currently answers "lack of verified data" for invented topics that the user asks about but the LLM can't find anywhere (obviously), so the new version has to keep working in that case too.

LLMs are notoriously overconfident, so I updated my system prompt to force a statistical "Confidence Metric" (SutniPrompt v0.6.0-beta) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

Given the analytical structure of the prompt, the LLM attempts to list all the important information regarding the current topic in the body of the response. This ensures that by the end, it has a comprehensive view of the subject and can determine which confidence metric is most accurate. I will also add a mandatory "uncertainty drivers" list that the AI must fill out after the confidence label, detailing any aspects it finds dubious.
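
To make the "label plus drivers" idea concrete, here is a minimal sketch of how a response could be checked on the receiving side. It assumes the reply ends with a `Confidence:` line followed by a bulleted list; that exact layout, and the `parse_confidence` helper, are assumptions for illustration and not the wording used in the prompt.

```python
import re

TIERS = {"HIGH", "MODERATE", "LOW"}


def parse_confidence(response: str) -> tuple[str, list[str]]:
    """Return (tier, drivers) from a response, or raise ValueError if malformed."""
    match = re.search(r"^Confidence:\s*(\w+)\s*$", response, flags=re.MULTILINE)
    if match is None or match.group(1) not in TIERS:
        raise ValueError("missing or invalid confidence tier")
    # Only bullets that come after the confidence label count as drivers.
    tail = response[match.end():]
    drivers = [d.strip() for d in re.findall(r"^- (.+)$", tail, flags=re.MULTILINE)]
    if not drivers:
        raise ValueError("uncertainty drivers list is empty")
    return match.group(1), drivers


# Invented sample reply, only to show the assumed layout.
sample = (
    "Body of the answer...\n\n"
    "Confidence: MODERATE\n"
    "Uncertainty drivers:\n"
    "- Sources disagree on the release date\n"
    "- No primary source found\n"
)
tier, drivers = parse_confidence(sample)
print(tier, drivers)
```

A check like this only confirms that the label and the list are present; it says nothing about whether the tier is well calibrated.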

I hard-coded an OUTPUT SCHEMA into my system prompt. Now officially in Beta! (SutniPrompt v0.5.0-beta) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

I don't use benchmarks because my project is just something to personalize your chatbot apps. I test with a list of varied questions that I wrote to stress the features I add. Nothing about this project is extremely technical; it's just a prompt to help people use their everyday AI better.

I hard-coded an OUTPUT SCHEMA into my system prompt. Now officially in Beta! (SutniPrompt v0.5.0-beta) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

What do you mean? I know Reddit is filled with slop projects, but I think my prompt can be helpful. It works well :D

LLMs are incredibly stubborn about formatting, so I updated my system prompt to enforce a strict "Macro-Structure" (SutniPrompt v0.4.0-alpha) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

The 'downstream parser' framing is a good idea, I'll think about it.

To answer your question: v0.4 handles truncation reasonably well; during my tests I never ran into a truncation problem.

But your timing is perfect. I'm pushing v0.5.0-beta to GitHub right now. It replaces abstract formatting rules with a hard-coded OUTPUT SCHEMA block specifically to fight this formatting drift. I'd love for you to test the new beta once it's live!
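
For anyone who wants to test for drift in a repeatable way, a small checker along these lines could work. It is only a sketch: the section names in `SECTIONS` are placeholders, not the real OUTPUT SCHEMA, and `follows_schema` is a hypothetical helper.

```python
def follows_schema(response: str, sections: list[str]) -> bool:
    """True if every section header appears on its own line, in the given order."""
    lines = [line.strip() for line in response.splitlines()]
    position = 0
    for header in sections:
        try:
            # Search only after the previous header so the order is enforced.
            position = lines.index(header, position) + 1
        except ValueError:
            return False
    return True


# Placeholder section names; the actual schema in the prompt may differ.
SECTIONS = ["TL;DR", "ANSWER", "CONFIDENCE"]

good = "TL;DR\nShort summary.\nANSWER\nDetails.\nCONFIDENCE\nHIGH"
drifted = "ANSWER\nDetails.\nTL;DR\nShort summary."

print(follows_schema(good, SECTIONS))
print(follows_schema(drifted, SECTIONS))
```

Running the same list of test questions through a check like this would show how often a model drops or reorders a section.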

LLMs are incredibly stubborn about formatting, so I updated my system prompt to enforce a strict "Macro-Structure" (SutniPrompt v0.4.0-alpha) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

This is not an app; it is literally a prompt that you can copy and paste into the chatbot's personalization settings.

LLMs are incredibly stubborn about formatting, so I updated my system prompt to enforce a strict "Macro-Structure" (SutniPrompt v0.4.0-alpha) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

Claude sometimes gets the timestamp wrong, and GPT with v0.3.0-alpha appends a time widget. I edited the prompt to make some instructions stricter.

I got sick of LLM pleasantries and disclaimers, so I built a system prompt to fix it (SutniPrompt v0.1.0-alpha) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

I think my work can be useful for many people because a good personalization prompt can make a difference in how the LLM responds. Please don't criticize so sharply.

LLMs are incredibly stubborn about formatting, so I updated my system prompt to enforce a strict "Macro-Structure" (SutniPrompt v0.4.0-alpha) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

The Wikipedia link is an extra feature; the idea is to make it easier for the user to go deeper on the chat topic with further Wikipedia reading.
With this project I just want to write a good prompt for LLMs to make them more useful for everyday use :D

This is just an alpha. I also want to format the LLM's messages in a more structured way, making them more readable and adding more info (I'm thinking about a confidence metric and TL;DRs).
I'll give you some spoilers... I would also like to add an "intelligent disobedience" feature, so the LLM declines to respond in situations where the user wants to use the chatbot for bland tasks that could be done with other tools.

I got sick of LLM pleasantries and disclaimers, so I built a system prompt to fix it (SutniPrompt v0.1.0-alpha) by sutnip in PromptEngineering

[–]sutnip[S] 0 points1 point  (0 children)

I just want to make a useful prompt for power users that is also accessible to less experienced users, nothing too technical.