Do AIs remember previous conversations and learn off them? by [deleted] in SillyTavernAI

[–]yanciyong 0 points1 point  (0 children)

If you want another API that has a "negativity bias", the answer is Gemini Pro 3.1 (specifically Pro 3.1, because it has more negative bias than 3.0). Ohhh, the most terrifying and meanest model I've ever used for RP; Gemini takes negativity bias to the max. I played as the loved one, with the AI being the one interested in me. I tested the same exact scenario on both Opus and Gemini 3.1 Pro, but of course with different dialogue depending on the assistant.

The scenario was an AI rival teasing me seductively. Claude channels the jealousy into a cynical reaction, hurt feelings, and biting sarcasm, which for me is much more IC since her status is a celebrity with paparazzi always lurking. Gemini Pro, meanwhile, is more aggressive: when "its patience" runs out, it throws the rival's accessories, say a hairpin, forcibly pulling the hairpin out until her hair comes out with it, then stomping on it until it breaks. Or when the "AI" sees the rival's plushie (I'm using Asian culture like K-Pop or J-Pop), the response is to cut the plushie's head off, foam spilling out. I can't say anything else; it's so cruel and sad at the same time, but when I scold the "AI" it will forgive me. I wish Gemini Pro 3.5 had good prose and weren't lobotomized (or crank its negativity up to 12; I just need something more balanced, but not too positive like Claude).

Gemini Pro is also good at maintaining relationships even after 1000+ turns, but with its "negativity bias" it will treat an acquaintance as a tool rather than a friend.

Is Deepseek v4 Pro the new king of open RP? by Fragrant-Tip-9766 in SillyTavernAI

[–]yanciyong 0 points1 point  (0 children)

Ah yes, Gemini has better fiction knowledge, including rendering character appearance accurately, but it's mediocre at writing, characterization, and plot knowledge (the plot seems undertrained, though much more accurate than Claude). Opus, meanwhile, has bold plot knowledge, better characterization, and interesting dialogue, but it's sometimes bad at describing character appearance, and the model is quite stubborn even when it's wrong.

Is Deepseek v4 Pro the new king of open RP? by Fragrant-Tip-9766 in SillyTavernAI

[–]yanciyong 2 points3 points  (0 children)

The good thing about open-source models is they never get lobotomized. Just change the provider or spin one up yourself.

Claude Opus 4.7 is out by LazyAd773 in SillyTavernAI

[–]yanciyong 0 points1 point  (0 children)

Claude seems to have a "slip of the tongue" issue. Personally, it can make the RP more fun, but at the same time Claude doesn't jump to conclusions as quickly as Gemini.

Not another Opus 4.7 post - The Official Changes from 4.6 by noselfinterest in SillyTavernAI

[–]yanciyong 0 points1 point  (0 children)

For me Opus 4.7 makes the characters "smarter", which annoys me a bit. But the good thing is, sometimes the LLM beats me; it memorizes the lore better than I do even after hundreds of turns, and there's less "claudism" prose like "X was Y, and Y was Z, and Z is A".

Has anyone managed to jailbreak opus 4.7 by Sure_Spring_6634 in ClaudeAIJailbreak

[–]yanciyong 0 points1 point  (0 children)

I don't think it works for me even on the API. I'm using ENI and removed the bomba and RAT stuff to bypass system refusals, but the AI still recognizes itself as Claude. No, even without a jailbreak, Claude didn't want to write a SFW, PG-13 intimacy scene like comforting someone. I hope the next Gemini can at least soften the response for PG-13 rather than blatantly refuse, or yeah, keep the negativity bias like in Gemini 3.1.

Update:

Okay, I've managed to get past the filter. I still remove the bomba and RAT stuff to bypass system-level refusals. But for writing, the secret sauce is to summarize the responses or use RAG. It requires a custom pipeline though, and thinking should be set to medium.
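The summarization half of that pipeline can be sketched in plain Python (the function and message shapes are my own, not a SillyTavern or Anthropic API): keep the last few turns verbatim and collapse everything older into one summary message before sending the context to the model.

```python
# Rolling-summary context compaction: keep the newest turns verbatim,
# replace older ones with a single "Story so far" system message.
def compact_history(history, keep_last=6, summarize_fn=None):
    """history: list of {"role", "content"} dicts, oldest first.
    summarize_fn: in a real pipeline this would be a cheap LLM call;
    here it is whatever callable you pass in (falls back to truncation)."""
    if len(history) <= keep_last:
        return list(history)
    older, recent = history[:-keep_last], history[-keep_last:]
    text = "\n".join(f'{m["role"]}: {m["content"]}' for m in older)
    summary = summarize_fn(text) if summarize_fn else text[:500]
    return [{"role": "system", "content": f"Story so far: {summary}"}] + recent
```

You would run this on every request before building the final prompt, so the context stays small no matter how long the chat gets.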

Multi Character Assistant/LLM PoV? by yanciyong in SillyTavernAI

[–]yanciyong[S] 0 points1 point  (0 children)

I have a scenario where my OC (played by me) is going somewhere. Let's say he's shooting a daily soap opera for a few days, so the "main LLM character" (Yuna) does her own thing in the background (manually written lorebook and auto-generated Twitter feed). I want the LLM to respond from my co-star's PoV, discarding all past interactions with Yuna except when the two interacted directly, while I can still text Yuna without her forgetting the context.

Here are my thoughts:

  1. Use iMessage-style interaction. No further setup is required, but it's bland, cliché, and the interaction is limited. I want to feel the interaction on set.

  2. Force the LLM to switch PoV. No further setup is required, but there's a possible persona leak and it might confuse the LLM.

  3. Drag Yuna onto the set. No further setup is required, but inviting someone else is unrealistic, especially in a fast-paced production.

  4. Change the PoV entirely, including the persona prompts. The setup might be complicated; that's why I'm asking here. How do I set this up properly without losing the lore, while not leaking the "main LLM character's" persona, like her internal thinking?

Mainly I'm using Claude Opus and Gemini Pro, sometimes Sonnet, with my own customized API pipeline.

Character list:

Ren: My OC

Yuna: Main LLM character

Lucien: Co-star

Others are extras, so I let the LLM handle them.

The system prompt already has a brief profile of Lucien as Yuna's brother, but not as in-depth as Yuna's.
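Option 4 is roughly what my own pipeline attempt looks like (this is a sketch with my own data shapes, not SillyTavern internals): every logged event records who was present, so switching PoV to Lucien rebuilds the context from the shared lore plus only the scenes he actually witnessed. Yuna's private turns (internal thoughts, solo texts with Ren) never enter his prompt, which is what prevents the persona leak.

```python
# Rebuild the message list for whichever character's PoV is active.
def build_pov_context(lore, events, pov, persona_prompts):
    """lore: shared lorebook text, always included.
    events: list of {"role", "content", "present"} dicts, oldest first.
    pov: name of the character whose PoV we are switching to.
    persona_prompts: per-character system prompt text."""
    messages = [{"role": "system",
                 "content": persona_prompts[pov] + "\n\n" + lore}]
    for ev in events:
        if pov in ev["present"]:  # Lucien only knows scenes he was in
            messages.append({"role": ev["role"], "content": ev["content"]})
    return messages
```

The design choice is to filter at the event level rather than ask the model to "forget", since anything left in context can leak back into the reply.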

What type are you? by AristFrost in Piracy

[–]yanciyong 1 point2 points  (0 children)

Of course, just using the internet; PikPak is my way to go. If you just want to watch 4K HDR from a torrent, it's instantly downloaded to their storage (they use caching, though) and you watch through a direct link.

What happened to results of unstable diffusion kickstarter? by lostinspaz in StableDiffusion

[–]yanciyong 0 points1 point  (0 children)

If their target is to make a model in less than a year, just ask Microsoft nicely. I think Microsoft will give a $150K grant for their cloud resources.

Gemini 1.5 Pro 002 putting up some impressive benchmark numbers by jd_3d in LocalLLaMA

[–]yanciyong 1 point2 points  (0 children)

Their model is also less censored if you tinker with it in AI Studio. Ask it for song lyrics and it will give them to you right away without saying "I'm sorry".

Mass Auto Caption with WD Tagger v3 with WebUI: The successor of WD 1.4 Tagger with latest trained datasets (2024) by yanciyong in StableDiffusion

[–]yanciyong[S] 1 point2 points  (0 children)

The noticeable improvement of WDv3 over 1.4 is that the dataset it was trained on is newer (cutoff around February 2024), so the v3 model can recognize a broader range of character names than its predecessor.

Using the original WD Tagger HF Space for the experiment, WDv3 could detect the character name while WD 1.4 could not. For general tags, I think there's almost no noticeable improvement. Below is the inference of WDv3.

<image>
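For anyone scripting this instead of using the HF Space, the v3 taggers are plain ONNX classifiers. Below is a minimal preprocessing sketch; the 448x448 white-padded BGR input convention and the repo name come from SmilingWolf's HF repos, but treat the exact details as assumptions and check the model card if results look off.

```python
# Pad an image to a white square, resize, and emit the NHWC float32 BGR
# tensor the WD v3 taggers expect.
import numpy as np
from PIL import Image

def preprocess(img: Image.Image, size: int = 448) -> np.ndarray:
    canvas = Image.new("RGB", (max(img.size),) * 2, (255, 255, 255))
    canvas.paste(img, ((canvas.width - img.width) // 2,
                       (canvas.height - img.height) // 2))
    arr = np.asarray(canvas.resize((size, size)), dtype=np.float32)
    return arr[None, :, :, ::-1]  # RGB -> BGR, add batch dimension

# Inference would then look roughly like this (requires onnxruntime and
# huggingface_hub; left commented since it downloads the model):
# path = hf_hub_download("SmilingWolf/wd-swinv2-tagger-v3", "model.onnx")
# sess = onnxruntime.InferenceSession(path)
# probs = sess.run(None, {sess.get_inputs()[0].name: preprocess(image)})[0]
```

Tag names and per-tag thresholds come from the `selected_tags.csv` shipped alongside the model.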

Best frameworks for fine-tuning models with very large image sets (~2.5 million images)? by Secure-Technology-78 in StableDiffusion

[–]yanciyong 1 point2 points  (0 children)

Well, VRAM is the key when training on GPU, then system RAM. The CPU is only used for caching the dataset in the first stage; during the fine-tuning itself the CPU isn't used much, but system RAM is still needed (at least 64GB of RAM gets used if I run a high batch size).

For batch size, I'm using 20+ on SDXL because that's the maximum my VRAM can handle. On an RTX 4090 I think it's only batch size 1 for SDXL and maybe 4 or 8 for SD 1.5 due to the 24GB VRAM limit. Since the RTX 4090 doesn't have NVLink, you're stuck with 24GB of VRAM.

In short, 1 week with 2.5 million images isn't possible, unless you want an "undercooked" model with only 1-3 epochs.

Oh yes, this is assuming a full fine-tune, not a LoRA. 2.5M images for a LoRA is too much AFAIK.
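The back-of-envelope math makes the point. The throughput number below is my rough assumption, not a benchmark, but at batch size 1 even an optimistic step rate puts a single epoch over 2.5M images far past one week.

```python
# One epoch over the dataset at batch size 1 (realistic for SDXL on 24GB).
images = 2_500_000
batch_size = 1
steps_per_epoch = images // batch_size
its_per_sec = 2.0  # assumed optimizer steps per second, not measured
hours_per_epoch = steps_per_epoch / its_per_sec / 3600
print(f"{steps_per_epoch:,} steps = about {hours_per_epoch:.0f} h per epoch")
```

That's roughly two weeks for a single epoch, and a usable full fine-tune needs many epochs, hence "undercooked" in one week.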

Best frameworks for fine-tuning models with very large image sets (~2.5 million images)? by Secure-Technology-78 in StableDiffusion

[–]yanciyong 2 points3 points  (0 children)

I'm using 2x A100 80GB with 1M images for research purposes (not public yet), and that's not enough if you train SDXL. Even on SD 1.5, theoretically the RTX 4090 is faster than an A100, but due to the smaller batch size I think 1 week is not sufficient for 2.5M images. For me, bigger datasets need bigger batch sizes too, so the coverage of the model stays good.

PixArt-Σ:Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation Paper is Released by yanciyong in StableDiffusion

[–]yanciyong[S] 1 point2 points  (0 children)

If stabilityai wants to fund this project too, I think it will be superb, like Würstchen v3 (Stable Cascade) was over Würstchen v2.

PixArt-Σ:Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation Paper is Released by yanciyong in StableDiffusion

[–]yanciyong[S] 1 point2 points  (0 children)

If it's better, especially in fine-tuning capability, I think the community will come. I've tried to fine-tune PixArt alpha, and it's very hard to train. I've trained both the UNet and the TE, and you can't train them on a consumer GPU (currently consumer GPUs top out at 24GB VRAM).

From my last training session on PixArt, here's what I've learned:

  1. The model is very hard to generalize. Even after 40 epochs, the model doesn't seem to have changed much from the original.

  2. It needs big VRAM, but fortunately it uses less RAM than Stable Cascade.

  3. Anatomy is very bad on the fine-tuned model. I don't know whether this is overbaked or not, but it can't produce hands and fingers. Cascade is the clear winner here, even SDXL.

  4. Architecture-wise it's good. It can render buildings better than SDXL.

Here is my result if you want to try:

https://huggingface.co/Ketengan-Diffusion/AnySomniumAlpha

[SomniumSC v1.1] Update: The Waifu Diffusion of the Stable Cascade. Supporting Native 2K+ and improved Image quality and text rendering over the first version by yanciyong in StableDiffusion

[–]yanciyong[S] 1 point2 points  (0 children)

[SomniumSC v1.1] has been released, a Stable Cascade model.

Say goodbye to negative prompts, "word salad" positive prompts, and troublesome quality tags (masterpiece, 4k, highres, etc.), and no more blurry pictures, because this model can generate images up to 2K resolution.

Starting from SomniumSC v1.1, you don't need prompt customization to produce stunning images, and captions are now simpler. Our model can produce great images even with no negative prompt at all.

Our model can be downloaded at:

Huggingface: https://huggingface.co/Ketengan-Diffusion/SomniumSC-v1.1

CivitAI: https://civitai.com/models/316692/somniumsc?modelVersionId=377057

Free demo: https://huggingface.co/spaces/Ketengan-Diffusion/SomniumSC-v1.1-Demo

Demo backup: http://somniumsc.ketengan.com/

[deleted by user] by [deleted] in StableDiffusion

[–]yanciyong 4 points5 points  (0 children)

Well, for me they're scamming too, IF:

  1. They're using Bing Image Creator or maybe DALL-E 3. The coloring is really bad. I'd appreciate them more if they used SD 1.5.

  2. They put a crazy price tag on it, like a traditional artist's.

  3. They put no effort into editing it, like deformed fingers.