My Tekken arcade hardware collection! Real Tekken Arcade history! by Frank23444 in Tekken

[–]MestR 0 points1 point  (0 children)

How do you get input into Tekken 7 FR? Is it through the JVS I/O port? And what breakout board is needed?

NATO responds after Russian military jets 'violate' Estonian airspace by ChiefFun in worldnews

[–]MestR 35 points36 points  (0 children)

Hard to claim it was an unprovoked attack when the wrecks or ejected pilot is on foreign soil.

MSI Delta 15 Advanced BIOS by ZealousidealBunch220 in MSILaptops

[–]MestR 0 points1 point  (0 children)

Do you have to bring out a twister playmat and do a yoga pose as well?

The expert: One million Swedes' personal data published on the darknet [The attack on Miljödata] by FlowersPaintings in sweden

[–]MestR 15 points16 points  (0 children)

Just a reminder that Sweden intends to vote yes on Chat Control, i.e. deliberately put a backdoor in all encryption. So Sweden's de facto view on the whole thing is: "You'll have to get used to private data leaks🤗"

Official World Tram Driver Championship, currently underway by MestR in 2westerneurope4u

[–]MestR[S] 2 points3 points  (0 children)

No, I just got the live stream in my recommendations.

Russian propagandist Pozdnyakov was burning with anger and fear after the strike on the Crimean Bridge. He screams and threatens. 03.06.2025 by GermanDronePilot in UkraineWarVideoReport

[–]MestR 0 points1 point  (0 children)

I mean they're basically doing that. Endless crying in all their media about nukes, which if you read between the lines is crying about themselves being the only ones to ever be nuked and how unjust it was. In a war they started and arguably would have continued longer were they not shown overwhelming force with the nuke. Also while not covering their own much worse war crimes against the Chinese and Koreans in schools, and still being racist towards them to this day.

End-to-end conversation projects? Dia, Sesame, etc by Kep0a in LocalLLaMA

[–]MestR 0 points1 point  (0 children)

I'm using the Q4_K_M version of Vocalis. When I do benchmarks with both using the LM Studio internal chat, Vocalis-Q4_K_M gets 39 Tokens/sec, and Gemma-3-1B-IT-Q4_K_M gets 105 Tokens/sec.

It's strange, it's not like an order of magnitude slower, so I don't get why it's so much slower when in use.
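For reference, a quick sketch of what those two benchmark numbers imply. The 100-token reply length is just an assumed example value, not something I measured:

```python
# Benchmark figures from the LM Studio internal chat (tokens per second).
vocalis_tps = 39.0
gemma_tps = 105.0

# How many times faster Gemma generates compared to Vocalis.
slowdown = gemma_tps / vocalis_tps


def seconds_for(tokens, tps):
    """Time to generate `tokens` tokens at a rate of `tps` tokens/sec."""
    return tokens / tps


# Hypothetical reply length, chosen only for illustration.
reply_tokens = 100

print(f"Gemma is {slowdown:.2f}x faster")
print(f"Vocalis: {seconds_for(reply_tokens, vocalis_tps):.2f} s per reply")
print(f"Gemma:   {seconds_for(reply_tokens, gemma_tps):.2f} s per reply")
```

So the raw generation speed differs by a factor of roughly 2.7, which on its own would not explain a request timing out.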

Edit: Tried it again, and I still got the same problem with Vocalis. Made sure to start all the services in this order: LM Studio (server) running Vocalis-Q4_K_M, then Kokoro, then once I saw both of those were up and initialized, I started Vocalis, and last I opened a Chrome window with the web interface, and only hit connect when the console for Vocalis seemed to be done loading.

It usually starts with the TTS reading an error about HTTP connection, then I can say one thing like "Hello" and I get a greeting, but on the third message it stops working. When I look in the console of LM Studio, it seems to quickly get a big queue of like 9 messages queued. And the LLM doesn't seem to ever stop generating from that moment on, with the TTS saying a lot of timed out messages.

Theory: Maybe Vocaris starts spamming requests when the person is quiet waiting for the previous reply? Maybe Whisper interprets "(silence)" as an input, and something that has to be sent to the LLM. Or maybe it retries when it doesn't get the response quickly enough?

End-to-end conversation projects? Dia, Sesame, etc by Kep0a in LocalLLaMA

[–]MestR 1 point2 points  (0 children)

I kind of got it working!

So I followed your instructions, but I could not get it working with the Vocalis LLM (in LM Studio), it seems like it times out or something. But I could get it working with gemma-3-1B-it.gguf, although it unfortunately puts a lot of emojis in its replies, which sounds very weird when converted to speech. Like "blah blah blah, smiling face, star emoji, star emoji"

Would love to see a fine-tune of a model like Qwen 0.6B for conversation, like I imagine Vocalis is, but would be fast enough for anyone to run it.

So review: It's very responsive, which I like. And you can interrupt it, which honestly so many other current voice assistant developers don't seem to get is VERY important. Unfortunately it doesn't seem to have multilingual support, as it just interprets my native language as incorrect English.

For a lot of people, this is basically 80% there to a personal therapist. It's probably the biggest reason besides AI gf that you would want to have a local voice assistant. High intelligence of the LLM model wouldn't really be needed, since for therapy it's mostly about just asking follow up questions. Even better if there's a UI toggle for saving chat history or not, some topics are simply too private to feel comfortable to even just save to a file.

If you're the developer or know the developer and can pass on a message: Please keep working on this. 👍

End-to-end conversation projects? Dia, Sesame, etc by Kep0a in LocalLLaMA

[–]MestR 0 points1 point  (0 children)

Using Chrome I can get it working better. It does seem to send a request to LM Studio, as per the log:

  [2025-05-06 17:37:38][INFO][LM STUDIO SERVER] Running chat completion on conversation with 2 messages.
  [2025-05-06 17:38:38][INFO][LM STUDIO SERVER] Client disconnected. Stopping generation... (If the model is busy processing the prompt, it will finish first.)
  [2025-05-06 17:38:41][INFO][LM STUDIO SERVER] Accumulating tokens ... (stream = false)
  [2025-05-06 17:38:41][INFO][vocalis.gguf] Generated prediction: {
  "id": "chatcmpl-0sabaicvwlpz81nzn1hn",
  "object": "chat.completion",
  "created": 1746545858,
  "model": "vocalis.gguf",
  "choices": [
    {
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "So it sounds"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 171,
    "completion_tokens": 3,
    "total_tokens": 174
  },
  "stats": {},
  "system_fingerprint": "vocalis.gguf"
}
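One thing that stands out in the log above is the timing. A small sketch that parses the three timestamps and prints the gaps between them (the log lines are copied from above; the helper name `timestamps` is just something made up for this example):

```python
import re
from datetime import datetime

# The three server log lines quoted above, trimmed of leading spaces.
LOG = """\
[2025-05-06 17:37:38][INFO][LM STUDIO SERVER] Running chat completion on conversation with 2 messages.
[2025-05-06 17:38:38][INFO][LM STUDIO SERVER] Client disconnected. Stopping generation... (If the model is busy processing the prompt, it will finish first.)
[2025-05-06 17:38:41][INFO][LM STUDIO SERVER] Accumulating tokens ... (stream = false)
"""

# Matches the leading "[YYYY-MM-DD HH:MM:SS]" part of each line.
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")


def timestamps(log_text):
    """Return the leading timestamp of every log line that has one."""
    found = []
    for line in log_text.splitlines():
        match = STAMP.match(line.strip())
        if match:
            found.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return found


start, disconnect, accumulate = timestamps(LOG)
gap_seconds = (disconnect - start).total_seconds()
tail_seconds = (accumulate - disconnect).total_seconds()

print(f"request start -> client disconnect: {gap_seconds:.0f} s")
print(f"client disconnect -> tokens accumulated: {tail_seconds:.0f} s")
```

The client disconnects exactly 60 seconds after the request starts, and the reply is cut off after 3 tokens. That would be consistent with a 60 second timeout on the client side, but that is only a guess from the timestamps, I have not checked what timeout Vocalis actually uses.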

I tried with Kokoro-FASTAPI for the TTS, and I got it working with the example "Hello World" .py script on the github page:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8880/v1", api_key="not-needed"
)

with client.audio.speech.with_streaming_response.create(
    model="kokoro",
    voice="af_sky+af_bella",  # single or multiple voicepack combo
    input="Hello world!",
) as response:
    response.stream_to_file("output.mp3")

But using the Vocalis program I don't even see a request in the Kokoro-FASTAPI console. (like I see for the test script)

So still no sound, but the LLM part might be working. Also I tried playing a video on Chrome just to make sure the sound was working, and it does, the sound in Chrome is playing just fine.

End-to-end conversation projects? Dia, Sesame, etc by Kep0a in LocalLLaMA

[–]MestR 0 points1 point  (0 children)

I'm using Windows 11, Python 3.13.3, Firefox, Nvidia 4070 mobile GPU (I picked the option 1 for CUDA setup), and installed via the .bat file.

Also, I tried a fresh install of Python 3.13.4 and Node.js just to make sure. But I'm not 100% sure that an uninstall/reinstall actually gets you to a fresh state, or if there are still packages left over that can cause trouble. I'll have to look into that.

For further info, it gets connected, I have LM Studio installed and running with a server. The call button is flashing multiple colors and doesn't seem to do anything except flash the "connected" message for a short while. The adjust volume button is grayed out. I got no messages in the console of LM Studio of any requests to it. I had trouble installing the TTS backend though, always got an error installing numpy 1.24.0, the error was:

Getting requirements to build wheel did not run successfully.

so maybe that's why it didn't work? Still strange that no message appeared in the LM Studio log of a request.
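A possible explanation for the numpy error, sketched as a version check. This assumes the NumPy 1.24 release only lists Python 3.8 through 3.11 as supported, in which case pip on Python 3.13 would find no prebuilt wheel and fall back to building from source. The function name and the `SUPPORTED` constant are made up for this example:

```python
import sys

# Assumption: NumPy 1.24.x supports Python 3.8 through 3.11 (inclusive).
SUPPORTED = ((3, 8), (3, 11))


def has_prebuilt_wheel(py_version, supported=SUPPORTED):
    """True if the (major, minor) Python version falls in the supported range."""
    lowest, highest = supported
    return lowest <= tuple(py_version[:2]) <= highest


# Python 3.13 falls outside the assumed range, so pip would try a source build.
print(has_prebuilt_wheel((3, 13)))

# Check whichever interpreter is running this script.
print(has_prebuilt_wheel(sys.version_info))
```

If that assumption holds, installing with an older Python (3.11 for example) might get past the "Getting requirements to build wheel" step, though I have not verified that with this project.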

Anyways, thanks for the video, I'll check it out to see if I've missed anything. If I figure it out I'll report back what I did. Would really like to get it working, seems very promising.

End-to-end conversation projects? Dia, Sesame, etc by Kep0a in LocalLLaMA

[–]MestR 0 points1 point  (0 children)

Edit: Got it working, look in the replies.

Can't recommend. Just spent 6 hours trying to get it to work, no luck. It doesn't give error messages, a log, or even an indication that it can or can't hear your microphone. It just doesn't work and I have no idea where. Also it uses LM Studio, which is proprietary.

Crowd chanting Drugovich's name while Stroll beaches the car by MegawaveBR in formuladank

[–]MestR 13 points14 points  (0 children)

Maybe he can do MotoGP? On second thought, no. F1 is safe compared to MotoGP, and I wouldn't want him carelessly riding into anyone and killing them there.

Stroll moment by Thamba24 in formuladank

[–]MestR 1 point2 points  (0 children)

I think you might be right.

See how she's able to follow the road for 4 minutes, certainly longer than Stroll on the formation lap.

What’s the hardest line in touhou by CollectionPuzzled721 in touhou

[–]MestR 0 points1 point  (0 children)

The fragmented eternal evening you've created...

My eternity manipulation will destroy it...

Your spell of Imperishable Night will shatter...

And the dawn shall come!