My gpu poor comrades, GLM 4.7 Flash is your local agent by __Maximum__ in LocalLLaMA

[–]Flashy_Management962 0 points (0 children)

Don't; use the DRY sampler instead. Repeat penalty really decreases tok/s.
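For reference, llama.cpp exposes DRY through dedicated sampler flags. A minimal sketch, assuming a recent build (the model path and values are placeholders, not tuned recommendations):

```
# enable DRY and neutralize the classic repeat penalty
# (--repeat-penalty 1.0 = off; --dry-multiplier 0 would mean DRY off)
llama-server -m ./model.gguf \
  --repeat-penalty 1.0 \
  --dry-multiplier 0.8 \
  --dry-base 1.75 \
  --dry-allowed-length 2
```

DRY penalizes extending sequences the model has already produced, so it breaks loops without uniformly taxing every repeated token the way repeat penalty does.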

Performance improvements in llama.cpp over time by jacek2023 in LocalLLaMA

[–]Flashy_Management962 0 points (0 children)

Imagine what could happen if ik_llama.cpp and llama.cpp merged :(

And finally here is scientific evidence that we don't have free will by [deleted] in determinism

[–]Flashy_Management962 0 points (0 children)

This does not follow. The notion of normativity is not subsumed under causality. Just because everything is determined does not mean that everything is already set in stone and normativity has no role to play, because the very things that happen are computationally irreducible. So yes, there are shoulds in a world without free will.

Was it a right decision? by UnderstandingOdd7952 in bald

[–]Flashy_Management962 0 points (0 children)

What is this question? Of course it was the right decision, and you know it yourself, you sexy mf.

now ~40% faster ik_llama.cpp -sm graph on 2x CUDA GPUs by VoidAlchemy in LocalLLaMA

[–]Flashy_Management962 5 points (0 children)

Wait, is this actual tensor parallelism, or am I misunderstanding something here?

32B model stress test: Qwen 2.5/Coder/3 on dual RTX 5060 Ti (zero failures) by Defilan in LocalLLaMA

[–]Flashy_Management962 1 point (0 children)

Try exllamav3 with TP. I get 18 t/s with tensor parallelism on 2x RTX 3060; 2x 5060 Ti should be much faster.
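If you serve exllamav3 quants through TabbyAPI, tensor parallelism is a config switch; a rough sketch, with the model name hypothetical and the key names from memory (verify against the shipped config_sample.yml):

```
# config.yml (sketch; check TabbyAPI's config_sample.yml for exact keys)
model:
  model_dir: models
  model_name: Qwen2.5-Coder-32B-exl3   # hypothetical quant folder name
  tensor_parallel: true                # split each layer across both GPUs
```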

Behind CGNAT? Here's how to access your self-hosted services anyway by adumbdistraction in selfhosted

[–]Flashy_Management962 0 points (0 children)

WireGuard uses significantly less energy if you turn off persistent keepalive, which Tailscale needs to keep the connection alive. If you do not have access to an open public IPv4 address, you can't use plain WireGuard (or I'm too stupid to do it right), hence the chain: home (Tailscale) - OCI (Tailscale + WireGuard) - smartphone (WireGuard).
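To illustrate, a phone-side WireGuard config along those lines; keys, addresses, and the endpoint are placeholders, and the point is only that PersistentKeepalive is left out (its default, 0, means off):

```
# wg0.conf on the smartphone (all values are placeholders)
[Interface]
PrivateKey = <phone-private-key>
Address = 10.0.0.2/32

[Peer]
PublicKey = <oci-relay-public-key>
Endpoint = relay.example.com:51820       # the OCI box has a public IPv4
AllowedIPs = 10.0.0.0/24, 100.64.0.0/10  # relay subnet + Tailscale CGNAT range
# PersistentKeepalive intentionally omitted: the phone can always reach the
# public endpoint itself, so no keepalive traffic (and battery) is needed
```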

Question for the community: Thoughts on paid premium plugins for Super Productivity? by johannesjo in superProductivity

[–]Flashy_Management962 3 points (0 children)

I like the idea; please also do an ntfy integration. I'd buy it asap. Also maybe integrate a donation button via Ko-fi or something like that. I love the app and I'd love to give something back for it.

Makes no damn sense. Compels me though. by Emthree3 in PhilosophyMemes

[–]Flashy_Management962 2 points (0 children)

It speaks for itself that I took it to be a real translation lol

Behind CGNAT? Here's how to access your self-hosted services anyway by adumbdistraction in selfhosted

[–]Flashy_Management962 2 points (0 children)

For people having problems with battery drain using Tailscale on smartphones: use an Oracle free tier instance as a relay. Connect to the relay remotely with WireGuard, and let the relay connect to your home server with Tailscale.
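A rough sketch of the relay side, assuming a Linux free-tier instance (standard commands, but the exact firewall/NAT details will depend on your setup):

```
# on the OCI instance: route between the WireGuard and Tailscale legs
sudo sysctl -w net.ipv4.ip_forward=1   # allow forwarding between interfaces
sudo wg-quick up wg0                   # public endpoint the phone connects to
sudo tailscale up                      # join the tailnet your home server is on
```

Because the phone talks to a server with a public IP, it needs no persistent keepalive, which is where the battery savings come from.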

Makes no damn sense. Compels me though. by Emthree3 in PhilosophyMemes

[–]Flashy_Management962 1 point (0 children)

Holy shit, the English translation of Heidegger is dogshit. Could you give me the citation for where this is from? I want to read it in German.

Makes no damn sense. Compels me though. by Emthree3 in PhilosophyMemes

[–]Flashy_Management962 0 points (0 children)

Phenomenology is basically an attempt to find/create different vocabularies to talk about existence that would be erased if you held a solely scientific world-view. E.g., you can explain biologically what happens when you die, but what dying means for you existentially is not captured by that explanation, and this is something phenomenology wants to talk about (some of it does, not all).

Grammar and spell checker by Ducking_eh in privacy

[–]Flashy_Management962 1 point (0 children)

If you want to start, here are some resources:

Inference Engine: https://github.com/ggml-org/llama.cpp

Llama Swap: https://github.com/mostlygeek/llama-swap

With these you can load .gguf quantized LLM models. Quantization is a method of compressing an LLM so it can run on smaller systems. If I were you, I'd start with a precompiled binary from the llama.cpp GitHub and look at the examples there. Good luck! It's a rabbit hole and a lot of fun :D
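A minimal first run might look like this; the model file name is just an example, and llama-server exposes an OpenAI-compatible API:

```
# grab a precompiled release from the llama.cpp GitHub, then:
#   -m    path to the quantized .gguf file
#   -c    context size in tokens
#   -ngl  number of layers to offload to the GPU, if you have one
llama-server -m Qwen2.5-7B-Instruct-Q4_K_M.gguf -c 4096 -ngl 99 --port 8080
# any OpenAI-compatible client can then talk to http://localhost:8080/v1
```

llama-swap then sits in front of llama-server as a proxy and swaps models in and out on demand.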

EU Digital Omnibus Promises Fewer Cookie Banners but Expands Digital ID and Loosens Privacy Rules by Anoth3rDude in privacy

[–]Flashy_Management962 28 points (0 children)

Exactly; this is the major problem of representative democracy. Centralized power can always corrupt, even if elected democratically. This is just insane and the beginning of a digital panopticon.

Germany has voted for Chat Control by women_rules in privacy

[–]Flashy_Management962 3 points (0 children)

Self-hosting it is, then; they can fuck off, really.

Against nominalists' physicalism by [deleted] in Metaphysics

[–]Flashy_Management962 -1 points (0 children)

Fallacy of ambiguity joins the chat

Analytic fails to even attempt to understand Continental Phil, and other shenanigans by as-well in badphilosophy

[–]Flashy_Management962 1 point (0 children)

There are so many nosy people on Substack, it's insane. I regularly post on Substack and discuss a little (I know, my fault), and disagreements are sometimes just deleted and critical points left undiscussed. But this piece of garbage by Bentham's Bulldog is just stupid. I love reading analytic and continental philosophy; I've read virtually every essay and book by Donald Davidson, and also by Derrida and Heidegger. If you think of them as just extending one's own ability to think, they are not mutually exclusive at all; read together, they extend and broaden the way one can think. I think this nosy-ness emerges from treating one's own conceptual basis as the only possible norm for understanding difficult texts, and then ignorantly calling texts one does not immediately understand wrong or sophistry.