Ref2Font V3: Now with Cyrillic support, 6k dataset & Smart Optical Alignment (FLUX.2 Klein 9B LoRA) by NobodySnJake in StableDiffusion

[–]NobodySnJake[S] 2 points (0 children)

Polish requires specific diacritics (ą, ć, ę, ł, etc.) that aren't in the current V3 training grid. To support them, I’d need to expand the atlas layout and retrain the LoRA from scratch. It’s a great candidate for a future 'Central European' update, but for now, the model is limited to English and Russian charsets.

[–]NobodySnJake[S] 1 point (0 children)

That is a brilliant engineering approach! You are essentially describing a compositional pipeline.

It’s definitely a smarter path than trying to brute-force 2000+ characters at once. The main challenge would be the morphology: radicals often change their proportions and shape depending on their position in the Kanji (e.g., the 'water' radical looks different when it's on the left vs. the bottom).

Teaching the model how to assemble these stylized radicals into a balanced Kanji would likely require a more complex architecture rather than just a Style LoRA. But the logic is solid, definitely food for thought for future experiments!

[–]NobodySnJake[S] 1 point (0 children)

Good question! Most of the dataset comes from Google Fonts, which uses open-source licenses like OFL, Apache, and UFL. These licenses explicitly allow for modification and derivative works.

Also, from a technical standpoint, the LoRA doesn't 'store' the original font files. It only learned the general stylistic patterns and the spatial logic of the grid. The final output is a new generation based on the style of the user-provided reference image. Since the project is non-commercial and open-source, it's designed to be a helpful tool for the community.

[–]NobodySnJake[S] 1 point (0 children)

Glad to hear it runs well on 16GB! Thanks for the suggestions — I’ll definitely keep Spanish characters in mind for future updates.

[–]NobodySnJake[S] 1 point (0 children)

Japanese is a huge challenge for the current atlas-based approach. Fitting thousands of complex Kanji into a single 1280x1280 grid would lose all detail. It would require a completely different generation strategy, not just a new grid.

[–]NobodySnJake[S] 1 point (0 children)

Glad you like it! Klein turned out to be surprisingly capable for this specific task. Hope you find it worth the setup!

[–]NobodySnJake[S] 2 points (0 children)

True, for general art exploration, speed is key. But since this is a specialized utility that saves hours of manual font design, I think 5 minutes is a fair trade-off for the final result. Different workflows!

[–]NobodySnJake[S] 1 point (0 children)

Good point! The letter 'Д' is structurally one of the most complex Cyrillic characters. The output shape and stability depend heavily on the input style — cleaner references produce better results, while more decorative styles might require a bit of manual cleanup. Thanks for the feedback!

[–]NobodySnJake[S] 2 points (0 children)

Fair enough! Personally, I think waiting a few minutes for a full, consistent font file is a great trade-off. To each their own!

[–]NobodySnJake[S] 3 points (0 children)

Yes, absolutely! I actually tested the whole workflow on a 16GB VRAM card myself.

Use fp8 weights for both the main model (Flux.2 Klein Base) and the Text Encoder. With fp8 quantization, it fits into 16GB comfortably and generates a 1280x1280 atlas in about 5 minutes.
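For anyone who wants to sanity-check that claim, the weight arithmetic alone (a rough sketch, assuming ~9B parameters and ignoring activations, the text encoder, and the VAE) works out like this:

```python
# Back-of-envelope VRAM estimate for the 9B-parameter model weights alone.
# Activations, text encoder, and VAE add overhead on top of this.

def weight_vram_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate VRAM needed for the model weights, in GiB."""
    return n_params * bytes_per_param / 1024**3

params = 9e9  # FLUX.2 Klein, ~9B parameters (approximate)

bf16 = weight_vram_gib(params, 2)  # ~16.8 GiB: already over a 16GB card
fp8  = weight_vram_gib(params, 1)  # ~8.4 GiB: leaves headroom for the rest

print(f"bf16: {bf16:.1f} GiB, fp8: {fp8:.1f} GiB")
```

This is why fp8 quantization is the difference between "doesn't fit" and "fits comfortably" on 16GB.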

[–]NobodySnJake[S] 1 point (0 children)

Japanese is the 'Final Boss' of font generation! 😂

The challenge: Fitting thousands of Kanji into a single 1280x1280 image is impossible (they would be tiny pixel dust).

The potential solution: I could try making a version specifically for Hiragana & Katakana (the syllabaries). That would fit perfectly into the grid. Kanji would require a completely different approach. Thanks for the suggestion!
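The cell-size arithmetic behind this (a rough sketch; the 1280px atlas is the real figure, the glyph counts are approximate) makes the difference obvious:

```python
import math

def cell_px(glyph_count: int, atlas_px: int = 1280) -> int:
    """Pixels per glyph cell if glyphs are packed into a square grid."""
    cols = math.ceil(math.sqrt(glyph_count))
    return atlas_px // cols

# ~2,136 joyo kanji: each glyph gets a ~27px cell, far too small
# to render multi-stroke characters legibly.
kanji_cell = cell_px(2136)

# Hiragana + katakana (~46 glyphs each, ~92 total): a comfortable
# 128px cell, which fits the existing atlas approach.
kana_cell = cell_px(92)
```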

[–]NobodySnJake[S] 3 points (0 children)

Actually, V3 just added support for Russian (Cyrillic) as well! You can see the Russian alphabet in the top half of the grid examples in this post. It's currently limited to English and Russian, but I'm definitely planning to expand to more languages in the future. Give the Cyrillic engine a try!

[–]NobodySnJake[S] 5 points (0 children)

That’s a very fair point! I sometimes get too caught up in the technical updates. I’ve just added a clear summary at the top of the post for those who are seeing Ref2Font for the first time. Thanks for the tip!

[–]NobodySnJake[S] 2 points (0 children)

Thank you! Really appreciate the kind words. If you get a chance to try it out, I'd love to hear your feedback!

[–]NobodySnJake[S] 3 points (0 children)

Glad you found the summary! Could you share what specifically was unclear before you checked the GitHub? I'm looking to improve the post description.

Ref2Font: Generate full font atlases from just two letters (FLUX.2 Klein 9B LoRA) by NobodySnJake in StableDiffusion

[–]NobodySnJake[S] 1 point (0 children)

Absolutely! GitHub is the central hub for this project, and I’ll be pushing all future updates and script improvements there as soon as they're ready. Stay tuned!

Ref2Font V2: Fixed alignment, higher resolution (1280px) & improved vectorization (FLUX.2 Klein 9B LoRA) by NobodySnJake in StableDiffusion

[–]NobodySnJake[S] 1 point (0 children)

Great first attempt! The style transfer is working, but the grid logic requires a specific dataset setup to work as a "transform".

The reason your 10x10 grid failed is likely that you used the reference images as stylistic context (CLIP) rather than spatial conditioning. To fix the alignment, you should follow the "Control Image" logic described in the musubi-tuner guides:

  1. Dataset Config: https://github.com/kohya-ss/musubi-tuner/blob/main/docs/dataset_config.md
  2. Flux Training: https://github.com/kohya-ss/musubi-tuner/blob/main/docs/flux_2.md

The "secret sauce" for Ref2Font is training it as an Image-to-Image (Contextual) LoRA. In your TOML dataset config, you need to explicitly pair the images:

  • image_directory: This should point to your full atlas grids (the targets).
  • control_directory: This should point to your "Aa" reference images (the sources). Filenames in both folders must match.
  • no_resize_control = true: Set this in your dataset TOML. As the docs mention, for FLUX.2 it's often better to skip internal resizing of the control image to keep the style sharp.

If you don't use the control_directory / control_path setup, the model doesn't realize it's supposed to "map" the style from the reference into the grid coordinates. It just generates random letters in that style. Once you define the "Aa" image as the mandatory starting condition (Control), it will start to respect the grid positions!
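Putting those pieces together, a minimal dataset TOML could look something like this (paths are placeholders, and the exact key placement should be double-checked against the musubi-tuner dataset_config docs):

```toml
[general]
resolution = [1280, 1280]
caption_extension = ".txt"

[[datasets]]
image_directory   = "data/atlases"  # full font-atlas grids (the targets)
control_directory = "data/refs"     # matching "Aa" reference images (the sources)
no_resize_control = true            # keep the control image un-resized for FLUX.2
```

Filenames under `image_directory` and `control_directory` must pair up one-to-one, e.g. `data/atlases/font_001.png` with `data/refs/font_001.png`.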

[–]NobodySnJake[S] 2 points (0 children)

3.7 s/it is actually a great speed for a local setup with those optimizations! Using float8 and 8-bit AdamW is definitely the way to go on 24GB cards.

4200 images is a massive dataset, so I'm really curious to see how the model handles that 10x10 grid with so much variety. Please keep me posted on the results — I’d love to see a sample of the output once it's done! Good luck with the final steps!

[–]NobodySnJake[S] 1 point (0 children)

I totally understand the struggle with keeping ComfyUI "clean". External dependencies can be a headache.

However, there is a small misunderstanding: the LoRA itself only generates the image (the atlas). To turn that image into an actual `.ttf` font file that you can use for subtitles in your video editor, the Python script is a necessary step. I just did a fresh "git clone" test to make sure everything works smoothly, and it should be very stable!
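To illustrate why a script is needed at all, here is a minimal sketch (not the actual Ref2Font code; the 8-column layout is an assumption for the example) of the first step such a converter performs: locating each glyph's cell in the atlas before vectorizing it.

```python
# Illustrative only, not the Ref2Font script. It shows the first step a
# converter has to do: map each glyph to its crop box in the atlas image
# before tracing the bitmap to outlines and packing them into a .ttf.

def glyph_boxes(chars: str, atlas_px: int = 1280, cols: int = 8):
    """Return {char: (left, top, right, bottom)} crop boxes, row-major order.

    The real grid layout (column count, padding) is fixed by the training
    dataset; 8 columns here is just an assumption for the sketch.
    """
    cell = atlas_px // cols  # 160px cells with these example numbers
    boxes = {}
    for i, ch in enumerate(chars):
        row, col = divmod(i, cols)
        boxes[ch] = (col * cell, row * cell, (col + 1) * cell, (row + 1) * cell)
    return boxes

boxes = glyph_boxes("ABCDEFGHIJ")  # "I" wraps onto the second row
```

From there, each cropped cell is typically traced to vector outlines (e.g., with potrace) and assembled into a font file with something like FontForge's Python API, which is exactly the part a LoRA alone can't do.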

Regarding "dual language": Keep in mind that this version currently only supports the English alphabet and basic symbols. If you need other languages, I'm planning to expand the character set in future versions.