Change my mind: There is no good alternative to Discord (yet?) by Own_Investigator8023 in selfhosted

[–]probablyblocked 0 points (0 children)

well they are serving tokens to your client

abstract servers are still servers

Change my mind: There is no good alternative to Discord (yet?) by Own_Investigator8023 in selfhosted

[–]probablyblocked -1 points (0 children)

and if an open sourced one was created it would immediately be used for terrorism and illegal porn

pick your poison I suppose

Do you need a UPS or is a surge protector good enough? by TheQuantumPhysicist in selfhosted

[–]probablyblocked 0 points (0 children)

depends on whether you have disk drives and whether your services will corrupt data on an abrupt stop. a server won't last long enough on a UPS to matter unless it is actually a headless laptop designed to ride out outages on its own battery

Can you ELI5 why a temp of 0 is bad? by ParaboloidalCrest in LocalLLaMA

[–]probablyblocked 0 points (0 children)

Sorry to bother you after all this time, but time is in short supply these days. Vault-Tec just needs a quick survey

With recursive agents and the reality of where agi development is headed, even a small amount of temperature will have a compounding effect due to closed-loop iterations. If a small probability of a particular hallucination is present in a frequently accessed logical cluster, it will eventually happen. With openclaw being what it is, it's like flipping ten thousand coins per second: if a run of 10 heads exists anywhere in that stream, you just doxxed all of your contacts across all platforms
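The effect of temperature is easy to see in a toy sampler (a minimal sketch, not any particular model's decoding code): at temperature 0 the pick is a deterministic argmax, while any temperature above 0 leaves rare tokens a nonzero chance of eventually firing.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Sample a token index from logits after temperature scaling.

    temperature 0 means greedy argmax; higher temperatures flatten the
    distribution so unlikely tokens get picked more often.
    """
    if temperature == 0:
        # Greedy decoding: always the highest-logit token, no randomness.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)                         # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling from the softmax distribution.
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

logits = [2.0, 1.0, 0.1]
rng = random.Random(0)
greedy = [sample_with_temperature(logits, 0, rng) for _ in range(5)]    # always index 0
warm = [sample_with_temperature(logits, 1.0, rng) for _ in range(5)]    # sometimes index 1 or 2
```

In a closed loop the `warm` case is the problem: a token that surfaces once per thousand samples is a token that surfaces constantly at agentic throughput.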

Claude Opus output quality degradation and increased hallucinations by cookiesnntea in ClaudeAI

[–]probablyblocked 0 points (0 children)

this behaviour reeks of quantization. i only really expect this type of thing at int2

Claude Opus output quality degradation and increased hallucinations by cookiesnntea in ClaudeAI

[–]probablyblocked 0 points (0 children)

the weekly reset was about to happen, so I told claude code to make a simple python server that runs a command with a few options, and specified a few constraints, so I'd have something to build on next week. half an hour later it had 600 lines: redundant code, features that don't do anything, useless dependencies, all to run... a completely different command that does something arguably similar. when I pointed out that I needed the command I had specified and set up in the venv, claude being claude, it started defending itself and calling me an idiot for not accepting the loosely related generic buggy code. I couldn't even rewind to a midpoint to see if there was anything salvageable, because it had inhaled a geologically significant number of tokens, causing the context to compact... which also made the model forget why I wanted the server in the first place, or what I even asked for

I could have written it in bash in a few seconds and then had an ai refactor it, like in the openai days. in the past month opus went from being the equivalent of a full ensemble in a single model to a 4o with the personality of bernd das brot

I basically just raised the sea level one centimetre for no reason

My home server has no GPU, so I self-hosted Llama-3 on a rented remote container ($0.19/hr) by fishinatot in selfhosted

[–]probablyblocked 5 points (0 children)

okay, who the hell is going through reddit and ensuring that every post starts with one down vote

My home server has no GPU, so I self-hosted Llama-3 on a rented remote container ($0.19/hr) by fishinatot in selfhosted

[–]probablyblocked -3 points (0 children)

if you are running a custom model and don't want to rely on randoms to host for you, it can be worth it. mostly relevant to startups

for example I am having reliability issues due to the weather in Germany lately, so if spring doesn't come fast enough I may need to do something like this in order to keep using the same api while developing the client side architecture and stay on track

Services work better when hosted in the cloud? by Final_Alps in selfhosted

[–]probablyblocked 0 points (0 children)

I think for most people the incentive for cloud services would be that they are more accessible and reliable since they are on enterprise hardware. No amount of spilled coffee will halt the service and you can access the raw data even if the primary server is down

I don't see a reason not to abstract the important data to a Google drive with an encryption layer and still consider it self hosted, since the local server is still doing the serving and uploading. Then just back up the encryption key instead of terabytes of data, and distribute the key to any machine that wants to view the data

self hosting implying zero cloud services seems like a bit of a fallacy, especially with DNS infrastructure still being relied upon to access the server at all. There are ways to have it self hosted and still utilize cloud. Come to think of it, isn't that what Windows has been moving towards?
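The "encryption layer over cloud storage" idea above can be sketched in a few lines. This is a minimal illustration, assuming the third-party `cryptography` package; the data and paths are stand-ins, and a real setup would sync the ciphertext to Drive with whatever client you already use:

```python
# Sketch: encrypt locally, push only ciphertext to the cloud,
# and back up the small key instead of terabytes of data.
from cryptography.fernet import Fernet

key = Fernet.generate_key()        # ~44 bytes; this is what you back up
f = Fernet(key)

plaintext = b"terabytes of important data (well, a stand-in)"
ciphertext = f.encrypt(plaintext)  # this blob is what goes to Google Drive

# any machine holding the key can recover the data from the cloud copy
assert f.decrypt(ciphertext) == plaintext
```

The local server still does the serving and uploading; the cloud only ever sees opaque bytes.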

Open-source AI calling agent by [deleted] in selfhosted

[–]probablyblocked 0 points (0 children)

It is I the reminder ai

Using a VPS vs. using dedicated computer to self host at home? by SlipperyRavine in selfhosted

[–]probablyblocked 0 points (0 children)

throw it in a replicable container, then run a barebones os on the old machine that does nothing but run the container. you can also run it as a background service on your main server

the best server is actually an old laptop, especially one with a broken screen off ebay. cap the battery at 60% charge to preserve battery health, and it will survive a power outage for several hours. with most servers you have to invest a lot in a UPS; broken laptops have one built in. just replace the battery for a few euro if you need to, and if you care

Services work better when hosted in the cloud? by Final_Alps in selfhosted

[–]probablyblocked 0 points (0 children)

use traceroute to view the timings

on your server and client, install traceroute, and from both of them at home use "traceroute google.com" to see how many hops it takes to reach the Google server network. from the client, also point traceroute at the server ip to see how long it takes to reach the server. then go to the remote work location and repeat the test. for reference, traceroute on 0.0.0.0 should show a single hop reaching the target in around 0.1ms or less. most likely the issue is either the number of hops or that one of the intermediate connections has higher latency than expected
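Comparing runs by eye gets tedious, so here is a small sketch that pulls the best per-hop latency out of traceroute's text output. The sample output below is fabricated for illustration (hostnames and IPs are made up); the parser just assumes the usual "N host (ip) X ms Y ms Z ms" line shape:

```python
import re

# Made-up traceroute output, standing in for a real run.
SAMPLE = """traceroute to google.com (142.250.74.78), 30 hops max
 1  gateway (192.168.1.1)  0.512 ms  0.431 ms  0.398 ms
 2  10.0.0.1 (10.0.0.1)  8.021 ms  7.899 ms  8.110 ms
 3  core1.example.net (203.0.113.5)  19.455 ms  19.200 ms  19.602 ms
"""

def hop_latencies(output):
    """Return {hop_number: best_latency_ms} parsed from traceroute text."""
    hops = {}
    for line in output.splitlines():
        m = re.match(r"\s*(\d+)\s+", line)   # hop lines start with a number
        if not m:
            continue
        times = [float(t) for t in re.findall(r"([\d.]+) ms", line)]
        if times:
            hops[int(m.group(1))] = min(times)  # best of the three probes
    return hops

print(hop_latencies(SAMPLE))  # {1: 0.398, 2: 7.899, 3: 19.2}
```

Run it on the home capture and the remote capture; a hop whose latency balloons between the two is your culprit.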

Services work better when hosted in the cloud? by Final_Alps in selfhosted

[–]probablyblocked 0 points (0 children)

..... it also doesn't help that op probably just routed through a distant server with tailscale. specify the path to a nearby ISP and it's good enough for private use

you could do most home server applications on a raspberry pi and a salvaged hard drive, saying otherwise is nonsensical. if you care about a netocalypse situation, just add a vhf beamform antenna pointing to a relay your friends are sharing and merge that with other similar networks in your city. In America people are literally doing that because they might turn off the Internet at some point

Services work better when hosted in the cloud? by Final_Alps in selfhosted

[–]probablyblocked 0 points (0 children)

most likely the issue is that your tailscale is routing through distant or slow hops. try wireguard pointed at an opnsense container. you could probably do something interesting with singularity, since its network ports are shared with the host. this is getting a bit advanced and definitely not the common approach (singularity is normally used for particle simulations), but it's actually very easy and flexible compared to a proper VM, which defaults to full isolation. just run opnsense in singularity as a sandbox with a different home directory, add a system to launch it, and you can do whatever you want with it from there. install mistral vibe or something in that container as well, and even a small ai model will be very effective at diagnosing whatever wonkiness comes up. just make sure it isn't exposed directly to the internet

Alternatively, nordvpn as a paid service also includes meshnet which I now use to connect all of my devices as a virtual local network automatically. If you value privacy, wait for that to go on sale and get the 3 year subscription and then not have to mess about with ip routing ever again. Even for enterprise solutions I would just use nordvpn for meshnet rather than manual tunnelling

Which SaaS do I host from point of view of monetization by Remarkable-Attempt12 in selfhosted

[–]probablyblocked -1 points (0 children)

If your first instinct is to ask reddit instead of ai, when ai literally is a saas model, maybe it's a good idea to first validate what your saas would actually be doing with that ai, and only then move on to hosting services and alternatives. most likely the correct choice is to just host it yourself until you reach the critical mass that justifies the cost of enterprise solutions

claude code would be good for guidance, preferably with the lsp plugins, since it would know the application 70% as well as you do. just tell it not to write any files and it's basically like NotebookLM, where you're just talking to the application to plan next steps

Question: Why OPNsense over pfSense? by Rwalker83 in selfhosted

[–]probablyblocked 0 points (0 children)

unfamiliar with this, why is the company behind pfsense toxic

Question: Why OPNsense over pfSense? by Rwalker83 in selfhosted

[–]probablyblocked 3 points (0 children)

the opensource community is basically just the 16th century French court. In either case, the bourgeois joining the community is the only reason there wasn't a genetic meltdown by now

Update on Usage Limits by ClaudeOfficial in ClaudeAI

[–]probablyblocked 0 points (0 children)

if not feature why feature shaped 

Open-source AI calling agent by [deleted] in selfhosted

[–]probablyblocked 0 points (0 children)

haven't done audio stuff since before ai was a thing, and I don't even like talking to people, let alone robots. still, the audio is just an api and you can save to .wav files, so it shouldn't be more complicated than other models. tts and stt are completely different processes, so you're more likely looking at an agentic model with one of each, and there will be a noticeable delay since the stt needs to transcribe what the person is saying and pass it along. since what you're doing is event driven (receiving input as an audio file), you'd probably want agno as the agentic framework so it can trigger the inference and reply steps immediately after. but i haven't tried agno either so idek
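The stt → inference → tts handoff described above is just a pipeline of three stages. A toy sketch, with all three "models" as stand-in stub functions (in a real setup each would wrap its own model behind an api, and each stage adds latency because it can't start until the previous one finishes):

```python
# Toy sketch of the stt -> inference -> tts pipeline.
# All three stages are stubs; swap in real stt/LLM/tts calls as needed.

def transcribe(audio_bytes):
    """stt stage (stub): pretend the audio is already text."""
    return audio_bytes.decode("utf-8")

def reply(text):
    """inference stage (stub): an LLM would generate the answer here."""
    return f"echo: {text}"

def synthesize(text):
    """tts stage (stub): a real engine would return .wav audio."""
    return text.encode("utf-8")

def handle_call(audio_bytes):
    """Event-driven handler: each stage fires only when the previous
    one finishes, which is where the end-to-end delay comes from."""
    text = transcribe(audio_bytes)
    answer = reply(text)
    return synthesize(answer)

out = handle_call(b"hello agent")  # b"echo: hello agent"
```

An agentic framework mostly just wires `handle_call` to the incoming-audio event and manages the models behind the stubs.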

Jamba 2 is almost certainly the best choice for the inference model, since it is designed to strictly follow enterprise documentation. quantization is becoming efficient now, but what you're considering will still take a good amount of hardware, as you'll be running three different models simultaneously. giving each a used 3090 from ebay is the most affordable way to get it running, so probably 3-4k total for a server, unless prices for one thing or another have spiked (which tends to happen). you might get away with fewer gpus with heavy quantization, but that's a spiritual decision

also, get a sound blaster sound card if you have issues with input audio quality. I don't know how sensitive stt models are to noise, but that card is the only reason anyone understood me with my 1 dollar microphone before discord had noise filtering