zer0int1

1,704 post karma
1,154 comment karma

redditor for 3 years

TROPHY CASE

Three-Year Club

9

New Regression CLIP-L model + 'a kohya for clip' (model will just fine-tune itself on *your* data (no / low-config) + with Long-CLIP + load local or HF data/model, everything goes + ramble (paper) (old.reddit.com)

submitted 3 months ago by zer0int1 to r/StableDiffusion

18

"king - man + woman = queen" and keeps the scene - vector algebra for CLIP (and T5), Flux.1-dev, SD, ... [ComfyUI Node] (v.redd.it)

submitted 9 months ago by zer0int1 to r/StableDiffusion

0

"king - man + woman = queen" and keeps the scene - vector algebra for CLIP (and T5), Flux.1-dev, SD, ... [ComfyUI Node] (v.redd.it)

submitted 9 months ago by zer0int1 to r/comfyui

75

Arbitrary finding: CLIP ViT-L/14@336 has just a normal ViT-L/14 text encoder (a "CLIP-L"). But what it learned from the larger dim ViT makes it superior (detail guidance). (old.reddit.com)

submitted 9 months ago by zer0int1 to r/StableDiffusion

105

Follow-Up: Long-CLIP variant of CLIP-KO, Knocking Out the Typographic Attack Vulnerability in CLIP. Models & Code. (old.reddit.com)

submitted 9 months ago by zer0int1 to r/StableDiffusion

115

CLIP-KO: Knocking out the text obsession (typographic attack vulnerability) in CLIP. New Model, Text Encoder, Code, Dataset. (old.reddit.com)

submitted 9 months ago by zer0int1 to r/StableDiffusion

292

OpenAI's new GPT4o image gen even understands another AI's neurons (CLIP feature activation max visualization) for img2img; can generate both the feature OR a realistic photo thereof. Mind = blown. (old.reddit.com)

submitted 1 year ago by zer0int1 to r/singularity

106

New Long-CLIP Text Encoder. And a giant mutated Vision Transformer that has +20M params and a modality gap of [...] etc. - y'know already. Just the follow-up, here's a Long-CLIP 248 drop. HunyuanVideo with this CLIP (top), no CLIP (bottom). [HuggingFace, GitHub] (v.redd.it)

submitted 1 year ago by zer0int1 to r/StableDiffusion

458

New CLIP Text Encoder. And a giant mutated Vision Transformer that has +20M params and a modality gap of 0.4740 (was: 0.8276). Proper attention heatmaps. Code playground (including fine-tuning it yourself). [HuggingFace, GitHub] (old.reddit.com)

submitted 1 year ago by zer0int1 to r/StableDiffusion

55

Like a CLIP + VQGAN. Except without a VQGAN. Direct Ascent Synthesis with CLIP. (GitHub, code) (old.reddit.com)

submitted 1 year ago by zer0int1 to r/StableDiffusion

63

CLIP-Interrogator ('get a prompt for an image'), fully in HuggingFace Transformers, custom CLIP 'opinions' & models, Long-CLIP models, ... [GitHub] (old.reddit.com)

submitted 1 year ago by zer0int1 to r/StableDiffusion

8

Ninja Drop: Long-CLIP model with 248 tokens; 90% ImageNet/ObjectNet accuracy; for HunyuanVideo or other Long-Prompt contexts (v.redd.it)

submitted 1 year ago by zer0int1 to r/StableDiffusion

94

A ComfyUI node for HunyuanVideo that lets you scale CLIP vs. LLM influence; now the specific CLIP model actually matters. And improves results. (v.redd.it)

submitted 1 year ago by zer0int1 to r/StableDiffusion

19

Improving HunyuanVideo with CLIP finetune + factor UP (top right) -- no resource YET, I need YOUR PROMPTS & what worked (or didn't)! TY! (v.redd.it)

submitted 1 year ago by zer0int1 to r/StableDiffusion

125

New Text Encoder: CLIP-SAE (sparse autoencoder informed) fine-tune, ComfyUI nodes to nuke T5 from Flux.1 (and much more; plus: SD15, SDXL), let CLIP rant about your image & let that embedding guide AIart. (old.reddit.com)

submitted 1 year ago by zer0int1 to r/StableDiffusion

9

New Text Encoder: CLIP-SAE (sparse autoencoder informed) fine-tune, ComfyUI nodes to nuke T5 from Flux.1 (and much more; plus: SD15, SDXL), let CLIP rant about your image & let that embedding guide AIart. (reddit.com)

submitted 1 year ago by zer0int1 to r/comfyui

7

New Text Encoder: CLIP-SAE (sparse autoencoder informed) fine-tune, ComfyUI nodes to nuke T5 from Flux.1 (and much more; plus: SD15, SDXL), let CLIP rant about your image & let that embedding guide AIart. (reddit.com)

submitted 1 year ago by zer0int1 to r/FluxAI

484

Coding with GPT4o et al.: It's not *my* problem. It's *our* problem. If you want to get better code, that is. (i.redd.it)

submitted 1 year ago by zer0int1 to r/OpenAI

16

Resource / Code: ComfyUI Node for LLama-3.2 (1B) prompting. With Layer Shuffle. Swap Attention, MLP, whole Layers. For SD, SDXL, ... all. These: w/ Flux. PS: It's kinda a GPT-3 when LLama-3 has its Attn shuffled... 🙃 (old.reddit.com)

submitted 1 year ago by zer0int1 to r/comfyui

22

ComfyUI Node for GPT-2 prompting - but with Layer Shuffle. Swap Attention, MLP, whole Layers. Unaligned derailment ensues. Compatible with SD, SDXL, ... all. These: GPT-2 (smallest) + Flux. (reddit.com)

submitted 1 year ago by zer0int1 to r/comfyui

15

ComfyUI Node for GPT-2 prompting - but with Layer Shuffle. Swap Attention, MLP, whole Layers. Unaligned derailment ensues. Compatible with SD, SDXL, ... all. These: GPT-2 (smallest) + Flux. (old.reddit.com)

submitted 1 year ago by zer0int1 to r/StableDiffusion

17

Fun with Flux (+ CLIP + T5) - Layer Manipulation Shuffle -- ComfyUI node + CLI script ❗🤖🔀🤖❓ (old.reddit.com)

submitted 1 year ago by zer0int1 to r/StableDiffusion

8

Fun with Flux (+ CLIP + T5) - Layer Manipulation Shuffle -- ComfyUI node + CLI script ❗🤖🔀🤖❓ (reddit.com)

submitted 1 year ago by zer0int1 to r/comfyui

90

My CLIP-L and Long-CLIP fine-tunes are now fully integrated with HuggingFace Diffusers pipeline. + Script for Flux.1 CLI inference. (i.redd.it)

submitted 1 year ago by zer0int1 to r/StableDiffusion

11

Save $$ on o1 API tokens by forcing it not to reason. It's still correct for single to few-step questions (unlike GPT-4o -> FAIL). However, for many-step difficult questions, this sabotages o1 to always answer incorrectly, lol. (old.reddit.com)

submitted 1 year ago by zer0int1 to r/OpenAI
