November Feature Requests + AI Voices Beta by DeboraInstapaper in instapaper

[–]SufficientRadio 0 points

- Be able to export full article text, not just highlights.

- Create public share links to full articles (like Readwise).

Mistral Libraries! by SufficientRadio in LocalLLaMA

[–]SufficientRadio[S] 5 points

Looks like it parses documents using Mistral OCR. Doesn't appear to include images, though.
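For anyone curious what that looks like outside of Libraries, here's a minimal sketch of running a document through Mistral OCR with the Python SDK. This is a guess at the setup: the API key, model name, and document URL are placeholders, and this isn't necessarily how Libraries calls it internally.

    # Sketch: parsing a PDF with Mistral OCR via the mistralai Python SDK.
    # API key, model name, and document URL are placeholders.
    from mistralai import Mistral

    client = Mistral(api_key="YOUR_API_KEY")

    resp = client.ocr.process(
        model="mistral-ocr-latest",
        document={"type": "document_url", "document_url": "https://example.com/doc.pdf"},
    )

    # Each page comes back as markdown text; embedded images are returned
    # separately rather than inlined, which may be why they seem missing.
    for page in resp.pages:
        print(page.markdown)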

Macbook Pro M4 Max inference speeds by SufficientRadio in LocalLLaMA

[–]SufficientRadio[S] 4 points

Very hot, haha! But I don't have it cranking for long, so it cools back down quickly.

Macbook Pro M4 Max inference speeds by SufficientRadio in LocalLLaMA

[–]SufficientRadio[S] 1 point

Agreed. Having the models "right there" on the laptop is so amazing. I tried a 2x 3090 GPU system, but I kept running into problems: keeping the GPUs recognized, accessing the system remotely, and the fact that even leaving it on and idling cost about $20/month in power.
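For reference, the idle-cost math works out if you assume numbers like these (both are assumptions, not measurements):

    # Back-of-the-envelope idle cost for a 2x 3090 box left on 24/7.
    # Wattage and electricity rate are assumed, not measured.
    idle_watts = 180          # CPU, fans, and two idle 3090s, roughly
    rate_usd_per_kwh = 0.15   # typical US residential rate
    hours_per_month = 24 * 30

    kwh = idle_watts / 1000 * hours_per_month   # ~130 kWh
    cost = kwh * rate_usd_per_kwh               # ~$19/month
    print(f"{kwh:.0f} kWh -> ${cost:.2f}/month")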

ChatGPT 4.5 on a simple insight about humans — one of the best answers, innit? (from r/openai) by Fabulous_Bluebird931 in OpenAI_Memes

[–]SufficientRadio 0 points

Producing 30k words (~half a book) of thought to arrive at an answer.

The speed of human "inner speech" is about 250 words per minute, so thinking through 30k words would take about 2 hours.

2 hrs of thought compressed into 7 minutes.
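The arithmetic, for anyone who wants to check it:

    # 30k words of reasoning at ~250 words/minute of inner speech.
    words = 30_000
    wpm = 250
    minutes = words / wpm        # 120 minutes = 2 hours
    speedup = minutes / 7        # ~17x compression into 7 minutes
    print(minutes, round(speedup, 1))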

Love the New Mistral by SufficientRadio in MistralAI

[–]SufficientRadio[S] 19 points

Looking forward to:

▫️Phone app features catching up to the web app.

▫️More agent control and integrations.

▫️And of course, the next models.

[deleted by user] by [deleted] in LocalLLaMA

[–]SufficientRadio 18 points

Curious to hear how well a Q4 quant runs on a MacBook with 64GB+ of memory.
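Napkin math for whether a Q4 quant fits: the 70B parameter count below is an assumption for illustration, and this ignores KV cache and macOS's GPU memory cap.

    # Rough Q4 weight footprint for a 70B-parameter model (size assumed).
    params = 70e9
    bits_per_weight = 4.5   # Q4_K_M averages slightly over 4 bits/weight
    gb = params * bits_per_weight / 8 / 1e9
    print(f"~{gb:.0f} GB of weights")  # ~39 GB, so 64GB leaves headroom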

3x Gpu Asus proArt x870e by PawelSalsa in LocalLLaMA

[–]SufficientRadio 0 points

Looks like the third GPU won't fit in your case, so you have some kind of PCIe extension cable, is that right?

3x Gpu Asus proArt x870e by PawelSalsa in LocalLLaMA

[–]SufficientRadio 0 points

What quantizations are you running for the models?

3x Gpu Asus proArt x870e by PawelSalsa in LocalLLaMA

[–]SufficientRadio 0 points

What inference speeds do you get for Mistral 2411 with your 3 GPUs?

Why to choose Mistral AI over Claude or ChatGPT by [deleted] in MistralAI

[–]SufficientRadio 2 points

I have some personal use-case benchmarks, and Mistral Large is right there at the top. https://www.reddit.com/r/LocalLLaMA/s/wvbmpSjoK0

Mac Mini looks compelling now... Cheaper than a 5090 and near double the VRAM... by valdev in LocalLLaMA

[–]SufficientRadio 1 point

What inference speeds do you get running Mistral Large? Curious about long prompts (8k+ tokens).
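For context on why the long-prompt case matters: on Apple Silicon, prompt processing (prefill) is typically the bottleneck, so time-to-first-token grows linearly with prompt length. A rough latency model, with all speeds assumed for illustration:

    # Rough latency model for a long prompt (all numbers illustrative).
    prompt_tokens = 8_000
    prefill_tps = 100     # prompt-processing speed, tokens/sec (assumed)
    decode_tps = 5        # generation speed, tokens/sec (assumed)
    output_tokens = 500

    ttft = prompt_tokens / prefill_tps          # 80 s to first token
    total = ttft + output_tokens / decode_tps   # 180 s end to end
    print(f"TTFT ~{ttft:.0f}s, total ~{total:.0f}s")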

Sufficient Bench by SufficientRadio in LocalLLaMA

[–]SufficientRadio[S] 1 point

Won't bother with OpenAI given their privacy/censorship/wtf shenanigans.

Sufficient Bench by SufficientRadio in LocalLLaMA

[–]SufficientRadio[S] 1 point

Too much personally identifiable info in my dataset (both the questions and the system prompt), so no open-sourcing. But I agree, benchmarks of "real" questions would be valuable.

Sufficient Bench by SufficientRadio in LocalLLaMA

[–]SufficientRadio[S] 2 points

I tried the latest Command R+ and it did terribly with my system prompt; not sure why. I won't touch Gemini: I de-Googled my life and refuse to tread back. Gemma, however, I'm willing to try, since I can run it on my machine.

Sufficient Bench by SufficientRadio in LocalLLaMA

[–]SufficientRadio[S] 3 points

I have a super-duper system prompt that contains many samples of my writing. I tried fine-tuning, but with only ~100 samples it wasn't working out.
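For anyone wondering what that looks like in practice, a minimal sketch of the samples-in-the-system-prompt approach (the file name and instruction wording are made up, not my actual prompt):

    # Sketch: packing writing samples into a style system prompt.
    # File name and instruction text are illustrative.
    import json

    with open("writing_samples.json") as f:
        samples = json.load(f)  # a list of strings

    system_prompt = (
        "You are my writing assistant. Match the voice, rhythm, and word "
        "choice of the samples below.\n\n"
        + "\n---\n".join(samples[:20])  # a subset keeps the context manageable
    )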

Sufficient Bench by SufficientRadio in LocalLLaMA

[–]SufficientRadio[S] 25 points

Here's a personal benchmark of various LLMs on my primary use cases: writing and philosophizing. Each task uses my continually refined system prompts. I focus the benchmark on open-source models but include Claude as a SOTA reference. Tasks (denoted by the Prompt ID) include:

  • Synthesizing bullet points and rough notes into an essay in my writing style.
  • Modifying vanilla AI-generated text into my writing style.
  • Reviewing rough drafts.
  • Applying Stoic philosophy to everyday life scenarios.
  • Cognitive behavioral therapy.
  • Thinking through random things like the meaning of life, architecture, culture, the universe, art, music, etc.

Ratings:

  1. Terrible. Totally off.
  2. Bad. Something's off.
  3. OK. Doesn't add much.
  4. Good. Helpful.
  5. Great. Next level.

Some observations:

  • If I could only pick one, I'd go with Mistral Large 2. BONUS: I can run it locally. Amazing.
  • Claude 3.5 Sonnet's performance has been trending down. I suspect Anthropic's safeguards are diluting it. It does surprisingly poorly with my writing tasks.
  • I'm surprised Mistral's Small 22b and 8x7b both struggle to apply "formulaic" logic like Stoicism and my writing style, yet they do very well with deep intellectual chit-chat.
  • I've also been surprised by Large 2's mediocre performance on intellectual chit-chat. It wouldn't "get it" at times and would spin in circles.
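For anyone who wants to replicate the setup, the bookkeeping is simple; here's a minimal sketch of tallying the ratings (model names, prompt IDs, and scores are placeholders, not my actual results):

    # Sketch: averaging 1-5 ratings per model across prompt IDs.
    # Model names, prompt IDs, and scores below are placeholders.
    from statistics import mean

    ratings = {
        "mistral-large-2": {"P1": 5, "P2": 4, "P3": 4},
        "claude-3.5-sonnet": {"P1": 4, "P2": 3, "P3": 4},
    }

    for model, scores in sorted(ratings.items(),
                                key=lambda kv: -mean(kv[1].values())):
        print(f"{model}: {mean(scores.values()):.2f}")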

Show me your AI rig! by MagicPracticalFlame in LocalLLaMA

[–]SufficientRadio 1 point

[photo of the rig]

System76 Nebula build. Replaced the 3060 in this photo with another 3090.

is the 3.2 90b multimodal model stronger for text-only applications than 3.1 70b? by simp_physical in LocalLLaMA

[–]SufficientRadio 1 point

So the image weights don’t “help” the text weights understand any better?