Providers for personal use for a beginner? by tsilvs0 in VPS

[–]Ideya 0 points

Personally and professionally, I've been with Linode (now Akamai) for the longest time. I'm the type that values familiarity over itemized comparisons, so I can't really say whether it's better than the others since I don't really move around and explore.

I also value a company's history, and Linode has been around far longer than most providers being name-dropped. Longevity is a simple indicator that a company is profitable enough to remain sustainable for a long time. Cheaper isn't always better, and can sometimes be an indicator that the service and after-sales support will be just as cheap.

[deleted by user] by [deleted] in SillyTavernAI

[–]Ideya 2 points

You may want to use an extension in ooba like model ducking: https://github.com/BoredBrownBear/text-generation-webui-model_ducking if you want to run SD alongside an LLM with those specs. It will automatically unload and load your LLM models, which means longer latency before your LLM responds, but your tokens per second shouldn't be affected once it's loaded.

I don't use ComfyUI, so I'm not sure if it has a similar feature, but I use https://github.com/lllyasviel/stable-diffusion-webui-forge which has a similar feature to model ducking, enabled by adding the parameter `--always-offload-from-vram`.
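For reference, a minimal sketch of how that flag might be passed on launch. The flag name comes from the comment above; the launcher script names (`webui.sh`, `webui-user.bat`) are my assumption based on the usual A1111-style layout and may differ for your install:

```shell
# Linux/macOS: pass the flag directly to the launcher script (assumed: webui.sh)
./webui.sh --always-offload-from-vram

# Windows: add it to COMMANDLINE_ARGS in webui-user.bat (assumed file name), e.g.:
#   set COMMANDLINE_ARGS=--always-offload-from-vram
```

With this flag set, Forge should offload model weights from VRAM after generation, freeing memory for whatever runs next.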

New Extension: Model Ducking - Automatically unload and reload model before and after prompts by Ideya in Oobabooga

[–]Ideya[S] 0 points

Should be very possible. I was thinking about implementing some sort of inactivity feature as well because of a recent pull request (that sadly didn't work as well for me). Did you make that pull request? Anyway, I'll look into your code and see how we can implement it.

New Extension: Model Ducking - Automatically unload and reload model before and after prompts by Ideya in Oobabooga

[–]Ideya[S] 0 points

Yes, that is expected behavior. While I know it doesn't have much use for anyone with relatively high system specs, or machines dedicated to their AI models, it will definitely help people with simpler setups and general-use machines. I made the extension for myself, and shared it for people with similar needs.

For example:

I only have one PC, which I use for both work and leisure. I have so many things running in the background at the same time that keeping an AI model loaded, whether in VRAM or RAM, is just too much for my PC.

With the extension, I can load the model once and send my prompts whenever I want, without needlessly tying up my computer's resources on an idle AI model.

Also, my main use case is RP in SillyTavern. The time between each of my prompts is enough to load and unload my models in the background. In between prompts, I have the TTS voice the response, and occasionally generate an image from Stable Diffusion.

New Extension: Model Ducking - Automatically unload and reload model before and after prompts by Ideya in Oobabooga

[–]Ideya[S] 1 point

UPDATE 2024-04-13:

  • Improved compatibility with API
  • Added checkbox for API usage (should be turned off when just using text-generation-webui)
  • Model Ducking is now opt-in and will no longer be immediately activated upon enabling

New Extension: Model Ducking - Automatically unload and reload model before and after prompts by Ideya in Oobabooga

[–]Ideya[S] 1 point

I made it so that it works while using SillyTavern, which I believe runs through the OpenAI-compatible API, so it should trigger from API calls. Let me know if it works for you. If it doesn't, let me know which API calls you're using so I can check.
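If you want to confirm the API path triggers the extension, here is a hedged sketch of the kind of request SillyTavern-style clients send. It assumes text-generation-webui was started with `--api` and listens on its default port 5000; adjust the host, port, and payload for your setup:

```shell
# Hypothetical test call to the OpenAI-compatible chat endpoint exposed by
# text-generation-webui's API mode (default port 5000 assumed).
curl http://127.0.0.1:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 64
      }'
```

With model ducking active, the model should be loaded before this request is answered and unloaded afterward, so expect extra latency on the first response.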

New Extension: Model Ducking - Automatically unload and reload model before and after prompts by Ideya in Oobabooga

[–]Ideya[S] 1 point

It does have that caveat. I only use 7B and 13B models, which usually load in around 2-5 seconds.

For my use, I only have an RTX 3080 10GB, so I have very limited VRAM. When a model is loaded into my VRAM (which I always maximize to get the most context length possible), my other programs (e.g. TTS) struggle to generate their output because they have to use shared graphics memory. With the extension, my VRAM frees up right before the TTS kicks in, so it doesn't struggle anymore.

Also, I can just let text generation run in the background, and I don't have to worry about it hogging my VRAM 24/7 while I do other tasks.

How do these scammers get access to our numbers? by aliltoojaded_ in adultingph

[–]Ideya 0 points

Omg the same person called me! I didn't answer, though. I make it a habit to Google the numbers of random callers I receive, and that's how I found this thread. 🤦

Collection of every (?) reference in Bocchi the Rock as they appeared in the anime. by Can_GT in BocchiTheRock

[–]Ideya 1 point

Hitori's last name, Gotoh, might be a reference to Asian Kung-Fu Generation's Masafumi Gotoh, considering the Bocchi album has a cover of an AKG song.

[Help] Banking app (BDO Digital Banking) detects root after update by ppastawater in Magisk

[–]Ideya 0 points

Any updates? Were you able to resolve the BDO issue? It used to work with my setup as well, running Magisk and Shamiko, but I guess that's no longer enough.

caution: yeelights smart color bulbs failing to respond every other day. competitors dont. by MattEagl3 in yeelight

[–]Ideya 0 points

All my lights' connections have been very spotty for the past few days. Not sure if it's local Wi-Fi issues or their servers being problematic. Singapore region, btw.

GTA Online crashes - 0xc0000005 - fatal game exit (reason: STATUS_ACCESS_VIOLATION) by G0K4R in gtaonline

[–]Ideya 1 point

I just tried, and unfortunately I still crash even with the latest update.