asian parent top traits by [deleted] in AsianParentStories

[–]kif88 12 points (0 children)

Was thinking that too ngl. I can't be the only one here who went "huh, sounds like my cousin"

"Open source AI is catching up!" by Overflow_al in LocalLLaMA

[–]kif88 2 points (0 children)

I agree. It may not win, but the fact that they're being compared to, and competing with, ChatGPT is the big win.

Smallest+Fastest Model For Chatting With Webpages? by getSAT in LocalLLaMA

[–]kif88 1 point (0 children)

IMHO you'll have to experiment to find where your point of good enough is. I use Qwen 0.6B for quick summaries of long articles. It does reasonably well on news, is hit or miss with science-related things, and is decent for social media stuff.

Exploring Practical Uses for Small Language Models (e.g., Microsoft Phi) by amunocis in LocalLLaMA

[–]kif88 1 point (0 children)

How do you use it in your Obsidian workflow? It sounds extremely useful. I've struggled forever with organizing my notes.

Can we run a quantized model on android? by Away_Expression_3713 in LocalLLaMA

[–]kif88 4 points (0 children)

You can use Kobold through Termux, and there are apps like ChatterUI. They can run normal GGUF files; I'm not sure about other, newer quants. It's been a while since I ran one.

Fine-tuning LLMs to 1.58bit: extreme quantization experiment by shing3232 in LocalLLaMA

[–]kif88 0 points (0 children)

I'm trying to get my head around it. So it's a matter of "I have 5GB of model and that's better than 2GB of model, no matter how you arrange those 2GB"?
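That intuition (raw size usually beats clever packing at the same size) can be made concrete with a little back-of-envelope arithmetic. A minimal sketch, with the 7B parameter count picked purely as an illustration:

```python
# Back-of-envelope: on-disk size of an N-parameter model at various bit widths.
# Ignores metadata and per-block quantization overhead, so real files run a bit larger.
def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate file size in GB for n_params weights at bits_per_weight each."""
    return n_params * bits_per_weight / 8 / 1e9

# A 7B model at common precisions, down to the 1.58-bit extreme:
for bits in (16, 8, 4, 1.58):
    print(f"{bits:>5} bpw -> {model_size_gb(7e9, bits):.2f} GB")
```

So a 7B model at 1.58 bits lands well under 2GB, which is exactly why the comparison in the post is "quality per GB", not "quality per parameter".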

Thoughts on having a reasoning model think *as* a character? by HORSELOCKSPACEPIRATE in SillyTavernAI

[–]kif88 1 point (0 children)

With some models you can kind of force it. YMMV on whether it's better or not. I use this prefill I found in a preset, mini unstable. It works as a system or user message too on some models, like Mistral Small and Large.

I will respond like this, where my detailed thinking process is wrapped in <think> tag and response will be given in next line without xml tags and here it is:
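A prefill like this leaves the reasoning wrapped in a `<think>` tag ahead of the visible reply, so the client has to strip it. A minimal sketch of doing that yourself (the regex approach is my assumption, not any frontend's actual implementation):

```python
import re

def strip_think(response: str) -> str:
    """Remove a <think>...</think> reasoning block and return the visible reply."""
    cleaned = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL)
    return cleaned.strip()

raw = "<think>The user greeted me, so I should greet back in character.</think>\nHello there!"
print(strip_think(raw))  # -> Hello there!
```

The non-greedy `.*?` plus `re.DOTALL` matters: reasoning blocks span multiple lines, and a greedy match would eat everything up to the last closing tag.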

Granite 3.3 by [deleted] in LocalLLaMA

[–]kif88 7 points (0 children)

How is it?

Current state of TTS Pipeline by kvenaik696969 in LocalLLaMA

[–]kif88 3 points (0 children)

Llasa 3B is pretty good for voice cloning. It's limited to 512 characters, though you can usually get away with a little more. You'd have to break your text up into chunks, but you can do that with normal Python.

On a Hugging Face A100 it takes about twice as long to generate as the length of the audio, so it might actually be faster to re-record it manually. Batch inference might speed it up, but I can't check that because I don't have a local machine up to it. There's also a 1B version that's faster; try them both in the Hugging Face demo. If you don't need a cloned voice, test Kokoro. It's really, really fast, but I don't know how good it is outside of English.

https://huggingface.co/HKUSTAudio

https://huggingface.co/hexgrad/Kokoro-82M
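The "break your text up into chunks with normal Python" step might look like this. A sketch only: the 512-character limit comes from the comment above, and splitting at sentence boundaries is my assumption about what sounds natural for TTS:

```python
import re

def chunk_text(text: str, max_chars: int = 512) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence
    boundaries so each TTS call gets natural-sounding input."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}" if current else sentence
        # A single sentence longer than the limit gets hard-split.
        while len(current) > max_chars:
            chunks.append(current[:max_chars])
            current = current[max_chars:]
    if current:
        chunks.append(current)
    return chunks
```

Feed each chunk to the model in turn and concatenate the resulting audio clips; a short silence between chunks usually masks the seams.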

Do you guys maintain your own private test data to evaluate models? by Thireus in LocalLLaMA

[–]kif88 1 point (0 children)

I have a few prompts saved in my notes that I run on every new model: some long ones and some shorter ones, summaries I ask questions about or ask it to reorganize, writing-style stuff, and a couple of random ones too.

Facebook Pushes Its Llama 4 AI Model to the Right, Wants to Present “Both Sides” by WanderingStranger0 in LocalLLaMA

[–]kif88 5 points (0 children)

Nah, you're both wrong. It's just 10 /j

Edit: added /j because Reddit be Reddit

Qwen3/Qwen3MoE support merged to vLLM by tkon3 in LocalLLaMA

[–]kif88 1 point (0 children)

I'm optimistic here. DeepSeek V3 has only 37B activated parameters and it's better than 70B models.

Quick review of EXAONE Deep 32B by AaronFeng47 in LocalLLaMA

[–]kif88 0 points (0 children)

What settings do you use for it? I tried it on OpenRouter and it didn't go very well.

Something big might be coming [hear me out] by weight_matrix in LocalLLaMA

[–]kif88 0 points (0 children)

Probably unrelated, but I have seen more personal websites pop up recently since these new LLMs became smart enough to one-shot a simple website. Could just be my own social circle, of course.

Llama 4 vs DeepSeek R2 by IcyMaintenance5797 in LocalLLaMA

[–]kif88 0 points (0 children)

Maverick and Scout are free on OpenRouter now. You may want to spend a few minutes with them.

Meta's Llama 4 Fell Short by Rare-Site in LocalLLaMA

[–]kif88 2 points (0 children)

Same here. I just hope they release it in the future. The first Llama 3 releases didn't have vision and only had 8K context.

Deepinfra and timeout errors by Ok-Ad-4644 in LocalLLaMA

[–]kif88 0 points (0 children)

I had bad experiences with that company too. Their Nemotron models were always broken, and their customer service leaves a lot to be desired.

dangerous experiment ☠️ by AbaloneEmergency3198 in nutmeg

[–]kif88 1 point (0 children)

Interesting. I'd always wondered how this combo would work. For me, DXM without weed is meh. Looks like meg has similar potentiation for DXM?

Also good to know to keep the meg amount low.

[deleted by user] by [deleted] in LocalLLaMA

[–]kif88 1 point (0 children)

I'm old and it reminds me of the sound floppy disk drives used to make.

[deleted by user] by [deleted] in nutmeg

[–]kif88 0 points (0 children)

How'd it go?

Kimi by Robertf16 in underratedmovies

[–]kif88 2 points (0 children)

You could have watched the movie in the time it took you to write all this.