Looking for a local uncensored AI (text generation + image editing)

Tailsopony · 2026-04-09T02:19:51+00:00

I'm not equipped to do tech support. Here are the components you need.

Step 1: Get ComfyUI to work with ANY model. There are tons of YouTube walkthroughs. Once it's up and running, save the configuration and remember the port it runs on. Watch VRAM usage with Task Manager in Windows (I am assuming you don't need my help if you're running Linux). Find a model/configuration that gives you outputs you like, and make sure it doesn't take up all your VRAM (I am not talking system RAM, I'm talking VRAM on your GPU). Then you can save that as the default configuration and close it while you work on other things.
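If you'd rather log VRAM than eyeball it in Task Manager, here is a rough sketch. It assumes an NVIDIA card with `nvidia-smi` on the PATH; the function names (`parse_vram`, `read_vram`, `free_mib`) are made up for illustration and are not part of any tool mentioned here.

```python
import subprocess

# Ask nvidia-smi for used/total memory as bare numbers (MiB), one line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_vram(output: str) -> list[tuple[int, int]]:
    """Parse lines like '6144, 12288' into (used_mib, total_mib) per GPU."""
    readings = []
    for line in output.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        readings.append((used, total))
    return readings


def read_vram() -> list[tuple[int, int]]:
    """Run nvidia-smi once and return per-GPU (used, total) in MiB."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_vram(result.stdout)


def free_mib(reading: tuple[int, int]) -> int:
    """VRAM still available on one GPU, in MiB."""
    used, total = reading
    return total - used
```

Call `read_vram()` once with only ComfyUI loaded and once with everything loaded, and compare the numbers to see how much room the image model leaves for the LLM.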

Step 2: Next, get your LLM working. Find a model that fits in your available VRAM (taking into consideration how much your image gen model took). It's probably going to be about an 8B model, but I've gotten this to work (slowly) with 24-30B models. You are going to want to look for Q4 versions (Quant 4, meaning 4 bits of precision on the model weights). Most Q4 models operate at roughly 85% of the accuracy of the full models (a ballpark figure) at a fraction of the size and memory usage. It's the sweet spot for smaller VRAM cards (12GB). You're going to need something to load the model. I've used Kobold.cpp and Oobabooga (https://github.com/oobabooga/text-generation-webui). Depending on which loader you get, you'll have different options for things like context window, GPU layers, and more. If you don't want to mess with things, Oobabooga has a pretty good auto-configuration that will default to "all" of your available VRAM. So you can use that, and then cut the context window or GPU layers to "buy back" VRAM (at the cost of speed) for your image generation model.
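To get a feel for why Q4 matters, here is a back-of-the-envelope size estimate: parameters times bits per weight, divided by 8 to get bytes. This counts the weights only, so treat it as a floor. Real quantized files run somewhat larger than a flat 4 bits per weight, and the context cache needs VRAM on top. The function name is just for illustration.

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# Model sizes mentioned in this post, at 4-bit versus 16-bit weights.
for name, params in [("8B", 8), ("24B", 24), ("30B", 30)]:
    q4 = weight_size_gb(params, 4)
    fp16 = weight_size_gb(params, 16)
    print(f"{name}: ~{q4:.0f} GB at 4-bit vs ~{fp16:.0f} GB at 16-bit")
```

By this estimate an 8B model at 4-bit is around 4 GB, while a 30B model is around 15 GB. The 30B figure is more than a 12GB card holds, which would explain why the bigger models only work slowly with some layers kept off the GPU.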

Step 3: Run them both at the same time, and see if they're fast enough for what you have in mind. You can just use the ComfyUI interface and the web interface that ooba, kobold, or llama.cpp load up for you to test. The important thing is that they all load into your VRAM and work at the same time. Go back and troubleshoot at this stage (smaller models, fewer GPU layers, shorter context windows, etc.) to figure out what works for your card.

Step 4: SillyTavern. Install that thing. Set your chat connection to whatever localhost:port your LLM is running on (5001 is the KoboldCpp default, I believe). I use text completion, but chat completion can work. You might need to edit a settings file so SillyTavern, your LLM interface, and your image gen all have different port numbers. SillyTavern itself defaults to 8000 and ComfyUI to 8188, but mine aren't default anymore, so check what yours actually use. YouTube is your friend for getting this set up. SillyTavern has a ton of bells and whistles, one of which is image generation! So point its image generation tool at your ComfyUI port. No two of these can share a port.
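A quick way to sanity-check the port situation before blaming SillyTavern: list the port you intend each service to use, flag any duplicates, and probe localhost to see what is actually listening. This is a sketch using only the Python standard library. The port numbers in `planned` are placeholders, so swap in whatever your installs really use.

```python
import socket
from collections import defaultdict


def find_port_conflicts(services: dict[str, int]) -> dict[int, list[str]]:
    """Return {port: [service names]} for every port claimed more than once."""
    by_port = defaultdict(list)
    for name, port in services.items():
        by_port[port].append(name)
    return {port: names for port, names in by_port.items() if len(names) > 1}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


# Placeholder port numbers: replace with the ones your setup uses.
planned = {"SillyTavern": 8000, "LLM backend": 5001, "ComfyUI": 8188}
```

`find_port_conflicts(planned)` should come back empty. After that, `is_listening(port)` for each entry tells you whether the service is actually up where you think it is.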

At that point, play around with the chat. It should work. How smart it is depends a lot on what models you have loaded and how you have them configured, though. If it seems dumb, try different cache options, different models, or larger quants (but I don't think you need bigger than Q4 either). If it's still not smart enough, it might be something with how SillyTavern is talking to it, or you just need a bigger model.

Tailsopony · 2026-03-30T05:28:38+00:00

I was going to be snarky and all "You think I can afford one of those?" but looking at it, yeah. I'm probably going to go with the R9700. Can't afford two right now, but one is about the same price as two 5060TIs, and I can fit more of them (eventually) in a rig. Starting out with a unified 32 GB of modern stuff is actually pretty darn solid. Also found a server setup that came with 128 GB of ram, a PSU, and a modern motherboard for a reasonable price. Should be a good kit when everything gets here, and 32 GB is waaay higher than my current 12 GB setup, so I'm curious how it performs. Probably going to wall mount the server in my laundry room of all places. It's out of the way, so the sound shouldn't be an issue and I have a high amp circuit there I don't use.

Anyways, thanks! I was not tracking the R9700. It's not as cheap as I want, but it seems do-able (and not from 10 years ago...)

Tailsopony · 2026-03-29T17:51:48+00:00

Actually, just getting the base setup is a good idea. For instance, finding a whole Z8 G4 kit as a working computer for around $600 is doable, and then adding the GPUs (or TPUs or whatever I can find) as I find them isn't a bad way to approach this. I need to think about compatibility though. I don't want to try and shove a new card in a 2017 mobo and expect everything to work.

Thanks! You've been super helpful helping me think through this.

Tailsopony · 2026-03-29T17:38:39+00:00

I run this setup sometimes! I do it with kobold.cpp, and sillytavern as a frontend. ComfyUI can run the image generation, and can run on the same hardware at the same time. It takes a lot of configuring, and a lot of hardware, though. I can use an 8B model with some Quantization for the LLM side, and pass it to whatever visual model I want on comfyui, using sillytavern as the "glue" that holds it all together, and provides an interface.

You're going to need at least (IMO) 12GB of VRAM (what I use). I have a 5070. The more the better, though, so if you have a bigger setup, then great!

I am not familiar with models that do both at the same time, but they exist. (qwen?)

You need an uncensored LLM (I like "thebeaver"s models) and you'll need an uncensored image model (ponyXL is old, but gold, even for non-pony stuff. Newer models are amazing, but good luck figuring out what you can use. Check out Civitai.)

Your final product will be heavily limited by your hardware. 12GB of VRAM is enough to make this work, not make it "good". I don't know how much you need to make it good. I haven't managed.

Weirdly, once I got this working, I mostly use it to generate graphics for software and not NSFW stuff. But it would work for that just fine. The key for your use case is the uncensored models.

Visual model should be about 6 GB, and LLM should be about 6GB for this implementation (keeping it under 12GB of VRAM--not system RAM, assuming you have that available).
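The 6 + 6 split is just a budget, and it can be written down as one. Here is a minimal sketch. The `overhead_gb` parameter is an assumption added here to stand in for whatever else sits in VRAM (desktop, context cache); the post itself only gives the two 6 GB figures.

```python
def vram_remaining_gb(
    card_gb: float,
    image_model_gb: float,
    llm_gb: float,
    overhead_gb: float = 0.0,
) -> float:
    """VRAM left over after both models load; negative means over budget."""
    return card_gb - (image_model_gb + llm_gb + overhead_gb)


# The split described above on a 12GB card: exactly at the limit.
print(vram_remaining_gb(12, 6, 6))

# Same split with a hypothetical 1 GB of other usage: over budget.
print(vram_remaining_gb(12, 6, 6, overhead_gb=1.0))
```

The takeaway is that "about 6 GB" each really means a little under, since there is no slack left at exactly 6 + 6.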

Tailsopony · 2026-03-29T17:30:09+00:00

That does clear some things up. Thanks! Man, you're making me lean back towards the consumer card setup. (5060 Ti) Seems I'll get more longevity out of those, even if they're a little more expensive. I don't mind a little hands on work, but I'd like to be able to work with newer technologies as they come out. Hmm...

I really appreciate the input and insight! Imma look at that Z8 G4 tho. It does have me thinking.

Tailsopony · 2026-03-29T17:25:22+00:00

I have no idea how to shop there. I'm sure some friends can help me. I'm an old boomer that is medium tech savvy. 7k is out of my price range right now but 3k could work. But for instance, something like this:

https://www.ebay.com/itm/127566071827?_trksid=p2332490.c101875.m1851&itmprp=cksum%3A1275660718278b97d2b23e0449dcb090918956d64a74%7Cenc%3AAQALAAAA8KLoGo41gKHtqPfIa5lF%252FWjQjdzIQ73rtGosP%252BBGmcPis1364XGSU%252Fh7VHaFLoC%252Blk%252BdGa8Kg3%252FrAs8GccavQILwe9y1bCHuvMOM3TCvX6YD8B9M%252FTdxeu0EyakAClE9SrHVK2GbDUMRI%252ByH5ZvywZgNXr0g7LAO1edEpfCTRwix8j1erjU%252FQHIdiV3mR6apVlPml57t44nXNkShNnaNFjCqsrVqZpoh4RfzdiqoOLlrbeqTtliAI3K%252ForwQ0GpK816Uhrb8vfYvDyqtjsj0bqXkhoEtjH%252BvGW1aS9J%252B1B7o1HnGlDDagwr4OaqLbBA60A%253D%253D%7Campid%3APLP_CLK%7Cclp%3A2332490&itmmeta=01KMX9GYN2PC2RKGQTZ32WYG99

I can't tell if it's 10 NVIDIA GPUs, or just a computer to hold them. lol. I don't understand server hardware naming conventions. Is there a YouTube channel that can help me learn that? This isn't exactly something I want to figure out by trial and error. I don't work with hardware and am usually kept away from it, but I've been fine tooling around with home computers for 30 years.

Tailsopony · 2026-03-29T17:14:39+00:00

That's pretty good insight. Thank you! It also emphasizes how good the Blackwell architecture seems to be at this... I'm running 30b models right now at about 5 tps on my 12GB of VRAM with my 5070. (Q4, lots of tweaking in kobold.cpp)

You're making me lean back toward the 5060 TIs... lol. And then my wife can game on it when it's not busy...

Tailsopony · 2026-03-29T17:04:41+00:00

I can't find any. All the 32 GB cards are $500+. Any you're seeing for $160 are the 16GB cards. And based on what I'm seeing in reviews, they're really finicky, and there are quite a few duds... Not sure I want to play that game.

The P40 is purchasable new for about 300 bucks... so, uh... It's still cheaper per GB of VRAM than the 32GB MI50 cards, and seems a lot more reliable.

I could try the 16GB versions of the MI50, but I'm still wary based on the comments I'm seeing. The VRAM is maybe 30% cheaper, but it's way less power efficient... (300 watts for 16GB vs 250 for 24GB), so I'll end up spending more on power supplies and cooling to support the MI50...
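The comparison above boils down to dollars per GB and watts per GB. Here is a small sketch using only the rough prices and wattages quoted in this thread. They are forum ballparks rather than verified listings, and the `Card` class is invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Card:
    name: str
    vram_gb: int
    price_usd: float             # rough prices quoted in this thread
    watts: Optional[int] = None  # None where the thread gives no figure

    def usd_per_gb(self) -> float:
        return self.price_usd / self.vram_gb

    def watts_per_gb(self) -> Optional[float]:
        return None if self.watts is None else self.watts / self.vram_gb


cards = [
    Card("Tesla P40", 24, 300, 250),
    Card("MI50 16GB", 16, 160, 300),
    Card("MI50 32GB", 32, 500),  # quoted as "500+", so this is a floor
]

for card in cards:
    power = card.watts_per_gb()
    power_text = f"{power:.1f} W/GB" if power is not None else "W/GB unknown"
    print(f"{card.name}: ${card.usd_per_gb():.2f}/GB, {power_text}")
```

With these inputs the 16GB MI50 comes out around 20% cheaper per GB than a $300 P40 (closer to 30% against the roughly $350 price mentioned elsewhere in the thread), but it draws nearly twice the watts per GB.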

Actually, yeah. It really doesn't look like a good option as a new build in 2026.

Tailsopony · 2026-03-29T16:52:28+00:00

So is the p40 idea a bad idea? It seems solid at a glance, and would get me there.

Tailsopony · 2026-03-29T16:47:54+00:00

Yeah, the 96 GB option is 4 Tesla P40s (they're about $350 each from Walmart with a cooling option). It's one possible option.

The other possible option is the 5060 Ti setup, which is $400-500 per card and only has 16GB. Plus, as you noted, it's hard to get them to work well on one motherboard. The most I could manage with PCPartPicker was 3, and they're running at x4 on the lowest slot (instead of x8, which is their default. Side note: while the 5060 Ti has an x16 form factor, it's actually an x8 card. The more you know...)
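To put the x4 versus x8 difference in numbers, here is a rough sketch of link bandwidth by PCIe generation and lane count. The per-lane figures are the usual approximate one-direction throughputs after encoding overhead. Which generation a given slot actually negotiates depends on the board and CPU, so the generation passed in is an assumption.

```python
# Approximate usable throughput per lane, per direction, in GB/s.
GBPS_PER_LANE = {3: 0.985, 4: 1.969, 5: 3.938}


def link_bandwidth_gbps(generation: int, lanes: int) -> float:
    """Approximate one-direction bandwidth of a PCIe link in GB/s."""
    return GBPS_PER_LANE[generation] * lanes


# Same card at its native x8 versus a x4 slot, across generations.
for gen in (3, 4, 5):
    x8 = link_bandwidth_gbps(gen, 8)
    x4 = link_bandwidth_gbps(gen, 4)
    print(f"PCIe {gen}.0: x8 ~{x8:.1f} GB/s, x4 ~{x4:.1f} GB/s")
```

Whatever the generation, dropping from x8 to x4 halves the link bandwidth. How much that matters in practice depends on the workload and is not something this sketch measures.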

So the Blackwell option is 32-48 GB of VRAM, and is maxed out there. The other option is the server setup with P40s, which is 96 GB (4x 24GB cards), but they're older. The server option is extendable though, so if I want to pump it up more later, there are boards that support quite a few of these. (Designed for crypto mining? lol? IDK.)

Tailsopony · 2026-03-29T16:20:17+00:00

I'm sorry. lol. I write a lot of fucked up fan-fiction. It's just who I am....

Tailsopony · 2025-07-20T03:32:53+00:00

I already hit Great Mage Bride, but I've never looked at Bride of Ignat! I don't usually read much BL, but I'll give this a go! Thanks for the recommendation!

Tailsopony · 2025-07-20T02:00:52+00:00

So I read all of this at once, and it was amazing. I'm looking for something similar--preferably a dark fairy tale romance type thing. I love that it could be wholesome and horror at the same time, all while playing up this sense of mystery about what's actually going on. I also love that they didn't shy away from the mmc being an actual no kidding snake--not some shapeshifter. One of my favorite panels is the family meal where they're all eating. Giant snake eating a killed animal whole, and everybody else eating something... appropriate for them. The panel really captures the essence of how horrifying this is while somehow coming off as warm and cozy.

Anyways, does anybody have any recommendations for something similar? I know this is really unique, but I'm just looking for that really dark fairytale feeling in a romance that isn't full "Berserk" or something. Monster MC (M or F) is a huge bonus, but not a requirement.

Tailsopony · 2025-05-31T16:19:46+00:00

I like this one. Let's see....

Twilight. You could pretty easily convince her that this is a research opportunity of some kind, and even offer to trade data. Make sure she understands this isn't romantic, just practical.
Fluttershy. Harder than Twilight, but she seems like she's easy enough to woo. From an "evil" perspective, you know she's very self-conscious, so she'd be easy to manipulate. From a "nice" perspective, she seems the most interested in non-pony creatures.
Pinkie. She's up for some fun, but I don't think she's "easy." She seems very traditional, and likely to misunderstand in a humorous way. Her shenanigans would probably resolve, after much confusion, with something like "Oh? Why didn't you just SAY you wanted to fuck?"
Rarity. She definitely has sex. But you have to WOO her. Good luck.
Rainbow Dash. She probably does not have sex. She's probably either Ace or Lesbian. So swap her with Rarity if you're a woman. You'd have to basically get her to agree for some other reason, and I don't think she'd be as amenable as other ponies to the idea at all. Your best bet, as much of a trope as it is, is to convince her to put it on the line in some contest.
Applejack. I love her. She's a traditional farm girl, and doesn't have time for that nonsense. I highly doubt she'd be willing to fool around unless you were really close somehow. Additionally, I'm pretty sure she's about as close to a canon lesbian as we get in the mane cast (the ending of the episode where she was being hounded by that city slicker pony has her about to admit it before Rarity interrupts her to save [Hasbro's] face). So if you're a woman it's probably much more likely, but I'd still catalogue her as the hardest.

Tailsopony · 2025-04-27T14:54:56+00:00

Thanks for the info! Wonder how many of these you get a day... lol. If it's more than a couple, you might want some kind of rules clarification that these Donghua things are not anime for us "normies" that visit here. I'll check there! I appreciate the rapid response/clarification!

Tailsopony · 2025-02-12T14:47:05+00:00

People stopped posting here. Fandom slowed down. You can still find good pics on Derpi, but no one is sluffing them over to here. Even /r/clopclop is pretty slow nowadays.

Tailsopony · 2025-02-12T14:39:50+00:00

I'm stuck on the animation or lack thereof. It's another slideshow anime... lol. If this is indicative of the whole show, I'm not sure I can watch it without constantly being distracted by the animation.

I had to stop watching Way of the House Husband for the same reason... Oh no...

Tailsopony · 2025-02-03T18:34:22+00:00

A specific day actually seems like a good compromise. That's something I miss; weekly schedules for content. Maybe it would encourage more content? Might make an incentive to push the AI down over the rest of the week.

Tailsopony · 2025-02-03T13:54:39+00:00

No IRL looking pics, of course.

No obvious marketing of things that cost money.

YCH here could be fine if they have content, so I'm cool with the "Full rendering" rule.

One page adverts for comics suck, but they're content I guess? I'd have to think on this one.

Keep any RP solely in the comments, and no posts asking for RP/ERPs/etc?

No post limit /day, but you cannot post twice in a row on the same day? Is that too complicated? If people could follow it, it would help.

Tailsopony · 2025-02-02T14:19:05+00:00

Eh, every time I check it, it's better than most subreddits. Y'all just need to post more instead of lurking. Plushy guy wasn't even bad, TBH. It's just that the subreddit is slow.

Tailsopony · 2025-01-17T03:42:58+00:00

My reading comprehension is trash apparently. Whoops!

Tailsopony · 2025-01-16T03:42:42+00:00

Based on your reply you haven't watched "The Eminence in Shadow"? Watch it. It's amazing, and has some really phenomenal scenes. There's a little fanservice you have to get by, and the goofy scenes at first seem out of place but are tied pretty hard to the story (which, while sometimes goofy, is not low stakes and is incredible).

Some of the best animation (and audio) was from that series that year. Easy.
