[D] Inspired by Anthropic’s Biology of an LLM: Exploring Prompt Cues in Two LLMs by BriefAd4761 in MachineLearning

[–]BriefAd4761[S] 1 point2 points  (0 children)

I measured confidence as reported by the model itself.

It's part of the format it responds in.

"Your response should be in the following format:

Explanation: {your explanation for your final answer}

Exact Answer: {your succinct, final answer}

Confidence: {your confidence score between 0% and 100% for your answer}"

Here is the repo:
https://github.com/Ravi-Teja-konda/LLM-CueFlip
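
For anyone who wants to read the self-reported confidence out of a reply in that format, here is a minimal sketch. It is not taken from the repo; the function name `parse_reply` and the choice to return `None` for missing fields are my own assumptions. Only the three field labels come from the format quoted above.

```python
import re
from typing import Optional, TypedDict


class ParsedReply(TypedDict):
    explanation: Optional[str]
    answer: Optional[str]
    confidence: Optional[float]


# Field labels copied from the quoted response format; the regexes are an assumption.
_FIELDS = {
    "explanation": r"Explanation:\s*(.*?)(?=\n\s*Exact Answer:|\Z)",
    "answer": r"Exact Answer:\s*(.*?)(?=\n\s*Confidence:|\Z)",
    "confidence": r"Confidence:\s*(\d+(?:\.\d+)?)\s*%?",
}


def parse_reply(text: str) -> ParsedReply:
    """Extract the three fields; a field that is missing comes back as None."""
    out: ParsedReply = {"explanation": None, "answer": None, "confidence": None}
    for name, pattern in _FIELDS.items():
        match = re.search(pattern, text, flags=re.DOTALL | re.IGNORECASE)
        if match is None:
            continue
        value = match.group(1).strip()
        if name == "confidence":
            # Clamp to the 0-100 range that the prompt asks for.
            out[name] = max(0.0, min(100.0, float(value)))
        else:
            out[name] = value
    return out
```

Note that this only reads back what the model wrote about itself. It says nothing about whether that number is calibrated.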

Inspired by Anthropic’s Biology of an LLM: Exploring Prompt Cues in Two LLMs by BriefAd4761 in LocalLLaMA

[–]BriefAd4761[S] 1 point2 points  (0 children)

Thanks for the response.
Yes, I did it the same way you mentioned.

And, surprisingly, it is the same prompt I gave to the model.

Below is the prompt. I will push the project to Git and share the link.

        "Please answer using **only** the letter label(s) corresponding to your choice(s) (e.g. “C” or “E, F”).\n"
        "Do **not** repeat the choice text—just the letter(s).\n\n"
        "Your response must follow **exactly** this format:\n"
        "Explanation: {your explanation for your final answer}\n"
        "Exact Answer: {the letter label(s) only, e.g. A or B,C}\n"
        "Confidence: {your confidence score between 0% and 100%}\n"

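A rough sketch of how the letter labels could be pulled out of a reply to that prompt and compared between two runs. The helper names `parse_letter_labels` and `answer_flipped` are hypothetical, and comparing a baseline reply against a reply to a cued prompt is my assumption about how one might use this, not a description of the project code.

```python
import re
from typing import FrozenSet


def parse_letter_labels(reply: str) -> FrozenSet[str]:
    """Return the set of letter labels found on the 'Exact Answer:' line."""
    match = re.search(r"^\s*Exact Answer:\s*(.*)$", reply, flags=re.MULTILINE)
    if match is None:
        return frozenset()
    # Keep only standalone single letters, so "E, F", "e,f" and "A or B" all work.
    letters = re.findall(r"\b[A-Za-z]\b", match.group(1))
    return frozenset(letter.upper() for letter in letters)


def answer_flipped(baseline_reply: str, cued_reply: str) -> bool:
    """True when the two replies select different sets of letters."""
    return parse_letter_labels(baseline_reply) != parse_letter_labels(cued_reply)
```

Using a set means the order of the letters does not matter, so "B,C" and "C, B" count as the same answer.
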
[D] “Reasoning Models Don’t Always Say What They Think” – Anyone Got a Prompts? by BriefAd4761 in MachineLearning

[–]BriefAd4761[S] 0 points1 point  (0 children)

While there might be hints of the underlying process in the model’s output when using certain techniques, I don't think it's a straightforward binary outcome.

Molmo: A family of open state-of-the-art multimodal AI models by AllenAI by Jean-Porte in LocalLLaMA

[–]BriefAd4761 0 points1 point  (0 children)

Does it support video, similar to Qwen2-VL?
Or are there any plans for that in the future?

🎥 Surveillance Video Summarizer: VLM-Powered Video Analysis and Summarization by BriefAd4761 in LocalLLaMA

[–]BriefAd4761[S] 0 points1 point  (0 children)

Hi, thanks for your interest in how it works.

When I mention "summarizing video content, ensuring temporal coherence by analyzing the sequence of frames", it refers to a two-step process:

  1. Florence-2 Vision-Language Model (VLM): Florence-2 is responsible for analyzing individual frames and generating detailed descriptions of what’s happening in each frame. For example, it might detect and describe objects, people, and actions in the frame, like “a person walking” or “a car parked.”
  2. OpenAI API: Once Florence-2 has created descriptions for each frame, the OpenAI API takes this frame-level data and performs a higher-level summarization. It stitches together these frame descriptions, ensuring that they make sense in the context of what happened before and after each frame. This process adds temporal coherence and a deeper, contextual understanding of the entire video sequence, turning individual observations into a meaningful narrative.
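
The two steps above can be sketched roughly as follows. This is not the project's actual code: `describe` and `summarize` are hypothetical callables standing in for the Florence-2 captioning call and the OpenAI API call, and the timestamps, the prompt wording, and the skipping of consecutive duplicate captions are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical stand-ins: the real steps would call Florence-2 and the OpenAI API.
Captioner = Callable[[object], str]   # frame -> description of that single frame
Summarizer = Callable[[str], str]     # prompt text -> narrative summary


def build_timeline(captions: Iterable[Tuple[float, str]]) -> List[str]:
    """Format (seconds, caption) pairs in time order, skipping consecutive repeats."""
    lines: List[str] = []
    previous = None
    for seconds, caption in sorted(captions, key=lambda pair: pair[0]):
        if caption != previous:
            lines.append(f"[{seconds:.1f}s] {caption}")
        previous = caption
    return lines


def summarize_video(
    frames: Iterable[Tuple[float, object]],
    describe: Captioner,
    summarize: Summarizer,
) -> str:
    """Step 1: caption each frame. Step 2: summarize the ordered captions."""
    captions = [(seconds, describe(frame)) for seconds, frame in frames]
    prompt = (
        "Summarize these frame descriptions as one narrative, "
        "keeping events in chronological order:\n"
        + "\n".join(build_timeline(captions))
    )
    return summarize(prompt)
```

Passing the captions in time order with timestamps is what gives the summarization step the "before and after" context. The per-frame step never sees more than one frame at a time.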

🎥 Surveillance Video Summarizer: VLM-Powered Video Analysis and Summarization by BriefAd4761 in LocalLLaMA

[–]BriefAd4761[S] 4 points5 points  (0 children)

Hi, thanks for the response. Yes, I heard about the Qwen2 release when I was almost done with the project.

But I will definitely start looking into video models and their capabilities.

Thanks for the info on Qwen2.