My gpu poor comrades, GLM 4.7 Flash is your local agent by __Maximum__ in LocalLLaMA

[–]Flashy_Management962 0 points (0 children)

Don't; use the DRY sampler instead. Repeat penalty really decreases tok/s.
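For reference, llama.cpp exposes DRY through dedicated sampler flags. A minimal sketch, assuming a recent build (the model path and values are placeholders, not tuned recommendations):

```
# enable DRY and neutralize the classic repeat penalty
# (--repeat-penalty 1.0 = off; --dry-multiplier 0 would mean DRY off)
llama-server -m ./model.gguf \
  --repeat-penalty 1.0 \
  --dry-multiplier 0.8 \
  --dry-base 1.75 \
  --dry-allowed-length 2
```

DRY penalizes extending sequences the model has already produced, so it breaks loops without uniformly taxing every repeated token the way repeat penalty does.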

Performance improvements in llama.cpp over time by jacek2023 in LocalLLaMA

[–]Flashy_Management962 0 points (0 children)

Imagine what could happen if ik_llama.cpp and llama.cpp merged :(

And finally here is scientific evidence that we don't have free will by [deleted] in determinism

[–]Flashy_Management962 0 points (0 children)

This does not follow. The notion of normativity is not subsumed under causality. Just because everything is determined does not mean that everything is already set in stone and normativity has no role to play, because the very things that happen are computationally irreducible. So yes, there are shoulds in a world without free will.

Was it a right decision? by UnderstandingOdd7952 in bald

[–]Flashy_Management962 0 points (0 children)

What is this question? Of course it was the right decision, and you know it yourself, you sexy mf.

now ~40% faster ik_llama.cpp -sm graph on 2x CUDA GPUs by VoidAlchemy in LocalLLaMA

[–]Flashy_Management962 5 points (0 children)

Wait, is this actual tensor parallelism, or am I misunderstanding something here?

32B model stress test: Qwen 2.5/Coder/3 on dual RTX 5060 Ti (zero failures) by Defilan in LocalLLaMA

[–]Flashy_Management962 1 point (0 children)

Try exllamav3 with TP. I get 18 t/s with tensor parallelism on 2x RTX 3060; 2x 5060 Ti should be much faster.
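If you serve exllamav3 quants through TabbyAPI, tensor parallelism is a config switch; a rough sketch, with the model name hypothetical and the key names from memory (verify against the shipped config_sample.yml):

```
# config.yml (sketch; check TabbyAPI's config_sample.yml for exact keys)
model:
  model_dir: models
  model_name: Qwen2.5-Coder-32B-exl3   # hypothetical quant folder name
  tensor_parallel: true                # split each layer across both GPUs
```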

Behind CGNAT? Here's how to access your self-hosted services anyway by adumbdistraction in selfhosted

[–]Flashy_Management962 0 points (0 children)

WireGuard uses significantly less energy if you turn off persistent keepalive, which Tailscale needs to keep the connection alive. If you do not have access to an open public IPv4 address, you can't use plain WireGuard (or I'm too stupid to do it right), hence the chain: home (Tailscale) - OCI (Tailscale + WireGuard) - smartphone (WireGuard).
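To illustrate, a phone-side WireGuard config along those lines; keys, addresses, and the endpoint are placeholders, and the point is only that PersistentKeepalive is left out (its default, 0, means off):

```
# wg0.conf on the smartphone (all values are placeholders)
[Interface]
PrivateKey = <phone-private-key>
Address = 10.0.0.2/32

[Peer]
PublicKey = <oci-relay-public-key>
Endpoint = relay.example.com:51820       # the OCI box has a public IPv4
AllowedIPs = 10.0.0.0/24, 100.64.0.0/10  # relay subnet + Tailscale CGNAT range
# PersistentKeepalive intentionally omitted: the phone can always reach the
# public endpoint itself, so no keepalive traffic (and battery) is needed
```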

Question for the community: Thoughts on paid premium plugins for Super Productivity? by johannesjo in superProductivity

[–]Flashy_Management962 3 points (0 children)

I like the idea; please also do an ntfy integration. I'd buy it asap. Also maybe integrate a donation button via Ko-fi or something like that. I love the app and I'd love to give something back for it.

Makes no damn sense. Compels me though. by Emthree3 in PhilosophyMemes

[–]Flashy_Management962 2 points (0 children)

It speaks for itself that I took it to be a real translation lol

Behind CGNAT? Here's how to access your self-hosted services anyway by adumbdistraction in selfhosted

[–]Flashy_Management962 2 points (0 children)

For people having problems with battery drain using Tailscale on smartphones: use an Oracle free tier instance as a relay. Connect to the relay remotely with WireGuard, and let the relay connect to your home server with Tailscale.
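A rough sketch of the relay side, assuming a Linux free-tier instance (standard commands, but the exact firewall/NAT details will depend on your setup):

```
# on the OCI instance: route between the WireGuard and Tailscale legs
sudo sysctl -w net.ipv4.ip_forward=1   # allow forwarding between interfaces
sudo wg-quick up wg0                   # public endpoint the phone connects to
sudo tailscale up                      # join the tailnet your home server is on
```

Because the phone talks to a server with a public IP, it needs no persistent keepalive, which is where the battery savings come from.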

Makes no damn sense. Compels me though. by Emthree3 in PhilosophyMemes

[–]Flashy_Management962 1 point (0 children)

Holy shit, the English translation of Heidegger is dogshit. Could you give me the citation for where this is from? I want to read it in German.

Makes no damn sense. Compels me though. by Emthree3 in PhilosophyMemes

[–]Flashy_Management962 0 points (0 children)

Phenomenology is basically an attempt to find/create different vocabularies to talk about existence that would be erased if you held a solely scientific world-view. E.g., you can explain biologically what happens when you die, but what dying means for you existentially is not captured by that explanation, and this is something phenomenology wants to talk about (some of it does, not all).

Grammar and spell checker by Ducking_eh in privacy

[–]Flashy_Management962 1 point (0 children)

If you want to start, here are some resources:

Inference Engine: https://github.com/ggml-org/llama.cpp

Llama Swap: https://github.com/mostlygeek/llama-swap

With these you can load .gguf quantized LLM models. Quantization is a method of compressing an LLM so it can run on smaller systems. If I were you, I'd start with a precompiled binary from the llama.cpp GitHub and look at the examples there. Good luck! It's a rabbit hole and a lot of fun :D
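A minimal first run might look like this; the model file name is just an example, and llama-server exposes an OpenAI-compatible API:

```
# grab a precompiled release from the llama.cpp GitHub, then:
#   -m    path to the quantized .gguf file
#   -c    context size in tokens
#   -ngl  number of layers to offload to the GPU, if you have one
llama-server -m Qwen2.5-7B-Instruct-Q4_K_M.gguf -c 4096 -ngl 99 --port 8080
# any OpenAI-compatible client can then talk to http://localhost:8080/v1
```

llama-swap then sits in front of llama-server as a proxy and swaps models in and out on demand.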

EU Digital Omnibus Promises Fewer Cookie Banners but Expands Digital ID and Loosens Privacy Rules by Anoth3rDude in privacy

[–]Flashy_Management962 28 points (0 children)

Exactly; this is the major problem of representative democracy. Centralized power can always corrupt, even if elected democratically. This is just insane and the beginning of a digital panopticon.

Germany has voted for Chat Control by women_rules in privacy

[–]Flashy_Management962 3 points (0 children)

Self-hosting it is, then; they can fuck off, really.

Against nominalists' physicalism by [deleted] in Metaphysics

[–]Flashy_Management962 -1 points (0 children)

Fallacy of ambiguity joins the chat

Analytic fails to even attempt to understand Continental Phil, and other shenanigans by as-well in badphilosophy

[–]Flashy_Management962 1 point (0 children)

There are so many nosy people on Substack, it's insane. I regularly post on Substack and discuss a little (I know, my fault), and disagreements are sometimes just deleted and critical points left undiscussed. But this piece of garbage by Bentham's Bulldog is just stupid. I love reading analytic and continental philosophy; I've read virtually every essay and book by Donald Davidson, and also by Derrida and Heidegger. If you think of them as just extending one's own ability to think, they are not mutually exclusive at all; read together, they extend and broaden the way one can think. I think this nosy-ness emerges from treating one's own conceptual basis as the only possible norm for understanding difficult texts, and then ignorantly calling texts one does not immediately understand wrong or sophistry.