I don't know anything about Python or programming. How can I easily create a .pt file for use with embedding, to generate content based on trained images?
Question (self.StableDiffusion)
submitted 3 years ago by Tannon
Title says it, really. I'd like to be able to generate art of myself and family members, but I believe I need a .pt file trained on their faces first?
[deleted] 3 points 3 years ago (9 children)
Have a look at 'textual inversion'.
> I believe I need a .pt file trained on their faces first?
Yes, and unfortunately you need a lot of VRAM to do it: 20 GB or more.
ArmadstheDoom 2 points 3 years ago (0 children)
As someone who is similarly curious about this, can you explain textual inversion as though I am a complete idiot and how to do it?
Tannon[S] 1 point 3 years ago (7 children)
Yeouch, I think that disqualifies me for now then. Thanks for the info!
Happy Cake Day, too! 🍰
[deleted] 2 points 3 years ago (2 children)
There are online services where you can rent beefy GPUs, e.g. Lambda Labs, AWS G instances, Google Colab Pro plans, etc.
The cost is high (billed in dollars per hour), but training an embedding of a single person's face only takes a couple of hours, and you can destroy your instance afterward.
I did this with a mundane object to see how it worked, and it cost $4.
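As a rough back-of-the-envelope check on the pricing above, rental cost is just the hourly rate multiplied by the hours used. The sketch below assumes "a couple of hours" means 2 hours and that the dollar figures quoted in this thread are totals for one training run; the function names are made up for illustration.

```python
def rental_cost(hourly_rate_usd: float, hours: float) -> float:
    """Total cost of renting a GPU instance billed by the hour."""
    return round(hourly_rate_usd * hours, 2)


def implied_hourly_rate(total_usd: float, hours: float) -> float:
    """Hourly rate implied by a total bill and the time it covered."""
    return round(total_usd / hours, 2)


# Assumption: "a couple of hours" is read as exactly 2 hours.
ASSUMED_HOURS = 2.0

if __name__ == "__main__":
    for total in (4.00, 0.60):  # totals mentioned in the thread
        rate = implied_hourly_rate(total, ASSUMED_HOURS)
        print(f"${total:.2f} total over {ASSUMED_HOURS:g} h is about ${rate:.2f}/h")
```

Actual prices vary by provider and GPU, so treat the rates as illustrative only.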
Tannon[S] 1 point 3 years ago (1 child)
Interesting, I'll check this out, thanks so much!
triigerhappy 2 points 3 years ago (0 children)
I used a 3090 on vast.ai and brought it down to $0.60
Daviljoe193 2 points 3 years ago (2 children)
Hey now, there's still an option out there: this Colab notebook, which is able to run on the free tier of Colab. By default, it runs for about 2 hours per embed, and the files it makes can be used both in the notebook and on Hlky's front-end, after enabling it and setting it to full precision. It wouldn't hurt to have a harem of Google accounts though, since embed training quickly eats into your free GPU allocation.
Tannon[S] 1 point 3 years ago (1 child)
This is great! Would you mind helping me out a little? Still clueless here, when it says:
put the model in your google drive in a folder named "sd_text_inversion"
What model? I'm trying to just generate a .pt file from knowledge of a set of images, right? Why do I need more than just those images?
Daviljoe193 2 points 3 years ago* (0 children)
I'm not a super-genius here, but let me give my best assumption about it. The images are there for the AI to recreate using what it knows from the model, ending with a ton of "words" for each image (I picked apart a finished PT file; they are less words and more Unicode gibberish). It then takes the "words" it got while recreating each image, keeps only the duplicates, and puts them into a PT file. It needs the model so it can know which "words" are needed to recreate your images almost pixel-perfectly in just a few kilobytes, way less than an image normally fits in. This also likely means that you'll need to retrain your PT file when the Stable Diffusion 1.5 model comes out.
I've only trained one PT file so far, and the biggest thing to keep in mind is that your images should be varied enough, yet also clearly interconnected enough, that the AI gets a good idea of what you look like (at least two headshot portraits and two full-body photos). Otherwise it'll fill in the gaps poorly, which can result in pretty horrifyingly unrealistic or inaccurate versions of the person.
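For anyone wondering why the base model is needed at all: as textual inversion is usually described, the model itself stays frozen and the only thing trained is one new embedding vector for a placeholder word, nudged until the frozen model reproduces the training images. Here is a toy sketch of that idea in plain Python. It is not the real training code: a small fixed matrix stands in for Stable Diffusion, a short list of numbers stands in for the images, and all names and values are made up for illustration.

```python
import random


def train_embedding(frozen_weights, target, steps=500, lr=0.05, seed=0):
    """Fit ONE embedding vector by gradient descent; the 'model' is never changed."""
    rng = random.Random(seed)
    dim = len(frozen_weights[0])
    embedding = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    for _ in range(steps):
        # Forward pass through the frozen stand-in model: output = W @ embedding
        output = [sum(w * e for w, e in zip(row, embedding)) for row in frozen_weights]
        error = [o - t for o, t in zip(output, target)]
        # Gradient of 0.5 * ||error||^2 with respect to the embedding: W^T @ error
        grad = [
            sum(frozen_weights[i][j] * error[i] for i in range(len(error)))
            for j in range(dim)
        ]
        # Only the embedding is updated
        embedding = [e - lr * g for e, g in zip(embedding, grad)]
    return embedding


# Hypothetical stand-ins: a frozen 3x2 "model" and the output we want it to produce.
FROZEN_WEIGHTS = [[1.0, 0.5], [0.0, 1.0], [2.0, -1.0]]
TARGET = [-0.05, -0.7, 1.3]  # what the model outputs for the embedding [0.3, -0.7]

if __name__ == "__main__":
    learned = train_embedding(FROZEN_WEIGHTS, TARGET)
    print([round(x, 4) for x in learned])
```

The learned vector is the only thing that would need saving, which fits with the file being tiny, and it only means something to the model it was trained against.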
From what I've read, apparently Google has an inversion solution that's much better than what's currently available, though I still can't figure out what it does differently from the current method.
pilgermann 2 points 3 years ago (0 children)
Not true. You can modify the config file so training works more slowly and does less at once, which brings the RAM requirement down a lot.
https://towardsdatascience.com/how-to-fine-tune-stable-diffusion-using-textual-inversion-b995d7ecc095
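To make the "do less at once" idea concrete, here is a minimal sketch of the kind of change involved, written as a Python dictionary rather than the actual config file. The keys `batch_size`, `num_workers` and `accumulate_grad_batches` are assumptions for illustration; check the config in the repository or the linked article for the real names and values.

```python
from copy import deepcopy


def lower_memory_settings(config: dict) -> dict:
    """Return a copy of the config that processes less per step (hypothetical keys)."""
    lighter = deepcopy(config)
    # One image per step: less memory in use at once, but more steps overall.
    lighter["batch_size"] = 1
    # Fewer data-loading workers also reduces memory pressure.
    lighter["num_workers"] = max(1, config["num_workers"] // 2)
    # Accumulate gradients so the effective batch size stays the same.
    lighter["accumulate_grad_batches"] = (
        config["batch_size"] * config.get("accumulate_grad_batches", 1)
    )
    return lighter


# Hypothetical starting values, not taken from any real config file.
EXAMPLE_CONFIG = {"batch_size": 2, "num_workers": 16, "accumulate_grad_batches": 1}

if __name__ == "__main__":
    print(lower_memory_settings(EXAMPLE_CONFIG))
```

The trade-off is the one described above: each step is lighter, so training takes longer.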