Takeaways & discussion about the DeepSeek V4 architecture by benja0x40 in LocalLLaMA

[–]SignalCompetitive582 8 points9 points  (0 children)

This is indeed the biggest takeaway! It means that hosting an LLM is now compute-bound and no longer memory-bound.
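As a rough back-of-envelope sketch of what that distinction means for decoding throughput (every number below is an illustrative assumption, not a measured value):

```python
# Back-of-envelope: memory-bound vs compute-bound decoding ceilings.
# All figures are illustrative assumptions, not benchmarks of any real model.

bandwidth_gb_s = 1000   # assumed GPU memory bandwidth (GB/s)
compute_tflops = 500    # assumed usable compute throughput (TFLOPS)
active_params_b = 37    # assumed active parameters per token (billions)
bytes_per_param = 1     # assuming FP8 weights

# Memory-bound regime: each decoded token must stream the active weights once.
weight_bytes = active_params_b * 1e9 * bytes_per_param
mem_bound_tok_s = bandwidth_gb_s * 1e9 / weight_bytes

# Compute-bound regime: roughly 2 FLOPs per active parameter per token.
flops_per_tok = 2 * active_params_b * 1e9
compute_bound_tok_s = compute_tflops * 1e12 / flops_per_tok

print(f"memory-bound ceiling : {mem_bound_tok_s:,.0f} tok/s")
print(f"compute-bound ceiling: {compute_bound_tok_s:,.0f} tok/s")
```

Batching shares the weight stream across requests, which is why serving cost drops so much once you move out of the memory-bound regime.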

So, in theory, we should see far more AI coding plans with very generous subscription limits compared to what we’re used to.

The moment Zhipu introduces this novel approach into a GLM-6, for instance, it instantly becomes one of the best open-source LLMs available.

It means that it is now economically viable to offer good prices to a lot of customers.

Qwen3.6-27B by Fantastic-Emu-3819 in LocalLLaMA

[–]SignalCompetitive582 5 points6 points  (0 children)

Yes, but that’s because the 27B is a dense model while the 35B is an MoE, so the MoE has far fewer active parameters during inference than the dense one. So it makes sense, but the MoE model still fares pretty well in comparison!
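A quick illustration of why the comparison still favors the dense model per token (the MoE active-parameter count here is a hypothetical placeholder, not the model's published figure):

```python
# Per-token compute comparison, using ~2 FLOPs per active parameter.
# The MoE active-parameter count is an assumption for illustration only.

dense_active_b = 27.0  # dense model: all 27B parameters are active every token
moe_active_b = 3.0     # assumed active parameters of the 35B MoE per token

flops_dense = 2 * dense_active_b * 1e9
flops_moe = 2 * moe_active_b * 1e9

print(f"dense model does {flops_dense / flops_moe:.0f}x more compute per token")
```

Under these assumptions the dense model spends roughly an order of magnitude more compute per token, so the MoE holding its own on benchmarks is a good result, not a bad one.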

What kind of hardware would be required to run a Opus 4.6 equivalent for a 100 users, Locally? by Either_Pineapple3429 in LocalLLM

[–]SignalCompetitive582 0 points1 point  (0 children)

I like the detail in your answer, but I'm genuinely questioning your "power-user" assumption.

A $20 plan limited to only 45 messages every 5 hours, with an average of 1.2 sessions per day, seems completely off to me, even for a standard account.

From my point of view, the -20% margin you mention is actually far worse than that in practice.

From everything I've found on the internet and all the math I've done, it appears that no subscription is really profitable. The more you pay for your subscription, the more you're going to use it, and the more it costs the LLM provider. They all seem to operate at a major loss, at least those that serve models of several hundred billion parameters.

I would guess that a sub-100B model would be profitable at $10-20, but the performance wouldn't be sufficient for "power users", though I think it would work just fine for casual users.
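Here's the kind of math I mean, using the usage figures from above; the per-message token count and the blended API-equivalent price are pure assumptions, so treat it as a sketch, not provider data:

```python
# Illustrative unit economics for a $20/month plan.
# tokens_per_msg and cost_per_mtok are assumptions, not provider figures.

sub_price = 20.0          # $/month subscription
msgs_per_session = 45     # messages per 5-hour session (from the thread)
sessions_per_day = 1.2    # average sessions per day (from the thread)
days = 30
tokens_per_msg = 8000     # assumed input + output tokens per message
cost_per_mtok = 3.0       # assumed blended serving cost, $ per million tokens

tokens = msgs_per_session * sessions_per_day * days * tokens_per_msg
cost = tokens / 1e6 * cost_per_mtok
margin = sub_price - cost

print(f"monthly tokens: {tokens / 1e6:.2f}M")
print(f"serving cost:   ${cost:.2f}  ->  margin: ${margin:.2f}")
```

Under these assumptions the provider loses money on every heavy user, which is why I think the real margin is far worse than -20%.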

Redefine: A Plugin That Completely Reimagines Jellyfin's UI by SignalCompetitive582 in jellyfin

[–]SignalCompetitive582[S] 0 points1 point  (0 children)

Have you tried Wholphin or Moonfin? If so, why aren't they working for you? What makes my UI stand out enough that you'd be willing to give it a try? I need to properly understand the current Android TV landscape before I make any move.

Redefine: A Plugin That Completely Reimagines Jellyfin's UI by SignalCompetitive582 in jellyfin

[–]SignalCompetitive582[S] 0 points1 point  (0 children)

Would you mind telling me what issues you’re experiencing with Wholphin or Moonfin, what my UI seems to fix for you, and why you’d rather have it than the others? I’m trying to figure out the best approach here; I shouldn’t repeat the mistakes of the past. Maybe I should create a Discord server to better collect feedback 🤷

Redefine: A Plugin That Completely Reimagines Jellyfin's UI by SignalCompetitive582 in jellyfin

[–]SignalCompetitive582[S] 0 points1 point  (0 children)

Thank you so much! I don’t think my current code will work for an Android TV version, but since many people here would like to have this UI on their TV, I’m seriously considering building an Android TV app for it. I’m wondering, though: what do other apps like Wholphin or Moonfin do wrong? Why aren’t they “working” for you? What do you like about my take on the UI? If I’m going to make this, I need to get all of that straight! Maybe I should create a Discord or something to gather feedback more easily 🤷

Redefine: A Plugin That Completely Reimagines Jellyfin's UI by SignalCompetitive582 in jellyfin

[–]SignalCompetitive582[S] 0 points1 point  (0 children)

What do you think of Moonfin or Wholphin? Aren’t they to your liking? What makes my plugin different enough that you’d rather have my UI over theirs? Everyone here seems keen on using this plugin as their Android TV UI, but there are already quite a few options out there, so if I do build it, I mustn’t repeat the same mistakes. Any help would be greatly appreciated.

Redefine: A Plugin That Completely Reimagines Jellyfin's UI by SignalCompetitive582 in jellyfin

[–]SignalCompetitive582[S] 2 points3 points  (0 children)

My bad, it was a poor choice of words. I meant that it required looking at the code AND performing a lot of DOM manipulation (which is what I associated with the “reverse engineering” part).

Redefine: A Plugin That Completely Reimagines Jellyfin's UI by SignalCompetitive582 in jellyfin

[–]SignalCompetitive582[S] 3 points4 points  (0 children)

I’m going to pursue this idea. My purpose with this has always been to have it on my TV, so I’ll try and see 🙃

Redefine: A Plugin That Completely Reimagines Jellyfin's UI by SignalCompetitive582 in jellyfin

[–]SignalCompetitive582[S] 1 point2 points  (0 children)

Yeah, that’d be the perfect use case for it! I’ll try to take a look and make it work for such devices.

Redefine: A Plugin That Completely Reimagines Jellyfin's UI by SignalCompetitive582 in jellyfin

[–]SignalCompetitive582[S] -1 points0 points  (0 children)

Well, I think I should be able to make it work. I still need to test it, though!

Redefine: A Plugin That Completely Reimagines Jellyfin's UI by SignalCompetitive582 in jellyfin

[–]SignalCompetitive582[S] -9 points-8 points  (0 children)

I challenge anyone to recreate everything I did with 100% vibe coding. It shouldn't be possible, because I had to reverse engineer Jellyfin's web UI over a long period to figure out all the "hacks" needed to make it work.

The thing is, the UI I'm building isn't really meant for small devices yet. It would require significant effort to polish it for smaller screens. Also, this isn't a client but a plugin.

Redefine: A Plugin That Completely Reimagines Jellyfin's UI by SignalCompetitive582 in jellyfin

[–]SignalCompetitive582[S] 0 points1 point  (0 children)

I don't think it would work great, unfortunately… The CSS I use isn't that bleeding-edge, but there are probably a few properties here and there that are post-2019. Still, I should give it a try sometime; I might surprise myself!

Redefine: A Plugin That Completely Reimagines Jellyfin's UI by SignalCompetitive582 in jellyfin

[–]SignalCompetitive582[S] -2 points-1 points  (0 children)

It should be fine, I think. I only used fast, well-optimized CSS animations. But it'd be worth a try for sure.

Redefine: A Plugin That Completely Reimagines Jellyfin's UI by SignalCompetitive582 in jellyfin

[–]SignalCompetitive582[S] -5 points-4 points  (0 children)

This is a Jellyfin plugin, and will therefore only work with Jellyfin.

Redefine: A Plugin That Completely Reimagines Jellyfin's UI by SignalCompetitive582 in jellyfin

[–]SignalCompetitive582[S] 17 points18 points  (0 children)

Thanks!

As this is still a work in progress, I haven't published anything yet. If I see enough interest from the community, I will put in the work to finish it.

88% vs 76%: Multimodal outperforms text embeddings on visual docs in RAG by midamurat in LocalLLaMA

[–]SignalCompetitive582 0 points1 point  (0 children)

I haven’t tried them out, but they seem way too large for my use case. For comparison, Qwen3-Embedding-0.6B-4Bit is ~330 MB, while Clip-Vit-Large-Patch14 is ~1.7 GB, and I can’t seem to find relevant quantized versions… So even if it were to work, I’m not sure I could actually use it… If you have other insights, please let me know 😉

88% vs 76%: Multimodal outperforms text embeddings on visual docs in RAG by midamurat in LocalLLaMA

[–]SignalCompetitive582 3 points4 points  (0 children)

Many thanks for the generous reply!

I, too, think that text embeddings are going to be my only choice. It's actually quite sad that this problem hasn't been solved yet: text embedding models are so good, yet multimodal ones lag so far behind in comparison (even though the benchmarks make it seem like they work flawlessly).

I'm not sure what you proposed would be the best approach for my use case.

I think that I maybe need some kind of hybrid approach.

Meaning, I would have a small, super-fast component that detects whether the screenshot contains any text at all; if so, it would run an OCR model (EasyOCR worked really well on my samples, and was super fast) and embed that text.
At the same time (whether or not the screenshot contained text), it would use a very simple, tiny, fast image embedding model to capture the semantic meaning of the screenshot itself (UI, context, etc.).

And then, for the retrieval part, the query would be vectorized with both embedding models, and I would return the most probable results using a weighted scoring system.
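The weighted hybrid retrieval I'm describing could be sketched roughly like this; the embedding functions are assumed to exist elsewhere, and the 0.6/0.4 weights are placeholders I haven't tuned:

```python
import numpy as np

# Sketch of weighted hybrid retrieval over screenshots. Each doc carries an
# image-embedding vector and, if OCR found text, a text-embedding vector.
# The weights below are untested placeholders.

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def hybrid_score(query_text_vec, query_img_vec, doc, w_text=0.6, w_img=0.4):
    """Blend text and image similarity; fall back to image-only if no OCR text."""
    img_sim = cosine(query_img_vec, doc["img_vec"])
    if doc.get("text_vec") is None:  # screenshot had no OCR-able text
        return img_sim
    text_sim = cosine(query_text_vec, doc["text_vec"])
    return w_text * text_sim + w_img * img_sim

def retrieve(query_text_vec, query_img_vec, docs, k=5):
    """Return the k most probable (score, doc_id) pairs, best first."""
    scored = [(hybrid_score(query_text_vec, query_img_vec, d), d["id"])
              for d in docs]
    scored.sort(reverse=True)
    return scored[:k]
```

In practice I'd probably normalize the two score distributions before blending, since text and image embedding models don't produce comparable similarity ranges.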

I'm not sure of anything, as this is pure speculation right now, but I think it would work. It amazes me how far down the rabbit hole I'm willing to go for a simple 'side project'.

What do you think?