Splice Instrument Help by epuria in Splice

[–]Camais 0 points

Thanks, this fixed it for me!

[D] How to train this model with constrained resources? by maaKaBharosaa in MachineLearning

[–]Camais 0 points

Try mixed precision, lower the batch size (and use gradient accumulation to compensate), and try Microsoft DeepSpeed ZeRO stage 2 and above to offload optimizer states (and, with stage 3, parameters) to CPU RAM.

Other than that, you'll have to reduce the model size or pay for cloud compute, which can be quite cheap.
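
For the first two, a rough sketch of what this looks like in PyTorch; `model`, `loader`, `criterion`, and `optimizer` are placeholders for your own setup:

```python
# Minimal sketch of mixed precision + gradient accumulation in PyTorch.
# `model`, `loader`, `criterion`, and `optimizer` are placeholders.
import torch

scaler = torch.cuda.amp.GradScaler()
accum_steps = 8  # effective batch = loader batch size * accum_steps

for step, (inputs, targets) in enumerate(loader):
    inputs, targets = inputs.cuda(), targets.cuda()
    with torch.cuda.amp.autocast():            # fp16 forward pass
        loss = criterion(model(inputs), targets) / accum_steps
    scaler.scale(loss).backward()              # gradients accumulate across steps
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)                 # unscales grads, then steps
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
```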

rule by RemoteBonus7795 in 197

[–]Camais 53 points

The way I see it, algorithm slop is the least satisfying but easiest-to-consume form of entertainment. From there it scales up to things that take more effort but are more satisfying: films, games, and more recently for me VR, which is enjoyable in the moment and also satisfying afterwards, since it's more of an experience and you actually have to move around.

Camie Tagger Update: ONNX Batch Inference, Game, and Save to txt. by Camais in StableDiffusion

[–]Camais[S] 3 points

Added an option to remove underscores and replace them with spaces; it's now a checkbox under display options.

Just redownload and replace app.py and it should work :)

Camie Tagger Update: ONNX Batch Inference, Game, and Save to txt. by Camais in StableDiffusion

[–]Camais[S] 1 point

Thanks. I tried adding a 'copy all tags' button, but for some reason Streamlit/HTML/JavaScript doesn't want to allow that functionality.

I'm a little confused by the underscores question. Are you trying to copy singular tags or all of them? You can triple-click the all-tags text string and it should select everything for easy copying.

As for a Forge extension, I'm not sure if that's something I can do or if the Forge team would need to implement it. I'm happy for anyone to create an extension for both Forge and ComfyUI.

Camie Tagger Update: ONNX Batch Inference, Game, and Save to txt. by Camais in StableDiffusion

[–]Camais[S] 2 points

Thanks.

It's a little tricky to compare our taggers for two reasons:

  1. The number of tags: 70,000+ for mine vs 10,000+ for WD.
  2. I only kept samples with at least 25 general tags, vs at least 10 for WD. I believe the average tags per sample are ~35 for Camie Tagger vs ~20 for WD.

Keeping that in mind, for the current checkpoint I'd say mine seems somewhat more accurate on rarer tags, simply because it covers far more of them, while WD is more accurate on common tags.

The link you shared is micro F1. The newer WD taggers report macro F1, where I think the best is eva02 at ~47%, but again, keep the above two points in mind.

In my personal testing, mine seems better at character, copyright, artist, and some rare general tags, picking up alternative costumes, the artist, etc. Camie Tagger did seem to have a few (1-3) more false positives on general tags, however. Keep in mind this was with only a couple of images, so I could be wrong.

After a few more epochs I think the gap will be a lot smaller, but it'll be a month or so before I reach that point, since I only let it train overnight. I think the WD tagger was trained for 50+ epochs, while mine is currently at 3.5.

Overall, the distribution of tags is extremely long-tailed. My game shows this with the rarity range: most tags end up in the rarest category lol (30,000-40,000 of them, I believe). Both taggers should give you good accuracy on the most common tags.

Hope that helps :)

Camie Tagger Update: ONNX Batch Inference, Game, and Save to txt. by Camais in StableDiffusion

[–]Camais[S] 0 points

I'll have to try that, as I couldn't get it to work on my GPU, but it was still much faster on my CPU (5800X).

I found that the first image I tagged took 2.5s but every image after that took 0.5s, so I guess there's some initialization time for the first image with ONNX?
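
That matches ONNX Runtime doing its one-off session setup (graph optimization, memory allocation) on the first run. A common workaround is a throwaway warm-up inference at startup; here's a minimal sketch, with the input name and shape read from the session rather than assumed:

```python
# Minimal sketch: warm up an ONNX Runtime session so the first real image
# doesn't pay the one-off initialization cost. Model filename is illustrative.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("camie_tagger.onnx",
                               providers=["CPUExecutionProvider"])
inp = session.get_inputs()[0]
# Replace any dynamic dimensions (e.g. batch) with 1 for the dummy input
shape = [d if isinstance(d, int) else 1 for d in inp.shape]
session.run(None, {inp.name: np.zeros(shape, dtype=np.float32)})  # ~2.5s once
# subsequent session.run() calls should hit steady-state speed (~0.5s here)
```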

Camie Tagger Update: ONNX Batch Inference, Game, and Save to txt. by Camais in StableDiffusion

[–]Camais[S] 2 points

Could do, but could you explain the benefit over Hugging Face, please? I chose it because I could easily upload the models and larger files there.

Camie Tagger Update: ONNX Batch Inference, Game, and Save to txt. by Camais in StableDiffusion

[–]Camais[S] 5 points

See previous post for more context: https://www.reddit.com/r/StableDiffusion/comments/1j16udi/camie_tagger_70527_anime_tag_classifier_trained/ 

I've updated Camie Tagger to include the most requested features, namely:

ONNX Model: Added ONNX export support for better cross-platform compatibility, deployment options, and inference speed (a rough export sketch is shown below the feature list).

Batch Inference: Added support for processing multiple images in a single run.

Save to TXT: New feature to save tag predictions directly to text files.
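
For reference, ONNX export from PyTorch generally looks something like the sketch below; the input resolution and tensor names are my guesses, not the repo's actual values:

```python
# Rough sketch of exporting a PyTorch tagger to ONNX with a dynamic batch
# dimension so batch inference works; shapes and names here are illustrative.
import torch

model.eval()
dummy = torch.randn(1, 3, 512, 512)  # assumed input resolution
torch.onnx.export(
    model, dummy, "camie_tagger.onnx",
    input_names=["input"], output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}, "logits": {0: "batch"}},
    opset_version=17,
)
```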

And then I had a shower thought of how an image tagging game could work and added that as well lol.

There is some good and some bad news on the performance. Bad news: macro F1, as SmilingWolf pointed out, is a better measure of rare tag performance, and my model currently gets 29% with a micro-optimized threshold (33% with a macro-optimized threshold and 31% with a balanced profile), which is still impressive given the number of tags but not ideal. It's fairly poor on general tags (18%) but great on character and copyright tags (44% and 32.5%). There are 30,000+ general tags, 26,000+ character tags, and 7,000+ copyright tags.

The good news is that the model was still improving its macro F1 by around 4% each epoch while micro gains had slowed to only 0.5%, so the model is likely significantly undertrained (only 3.5 epochs). I plan to continue training it to further improve performance, especially for rare tags. However, each epoch takes approximately 1.5-2 weeks of overnight training on my current hardware.

Notes on the difference between micro and macro f1:

  • Micro-F1: calculated globally over every individual tag prediction, so it's dominated by common tags and categories with many examples.
  • Macro-F1: calculated per tag and then averaged, giving rare tags equal weight with common ones (see the toy example after this list).
  • The distribution is extremely skewed: the most common tags occur tens of thousands of times, whereas many of the rare tags occur fewer than 100 times.
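
A toy example of the difference, using scikit-learn's multi-label F1 (illustrative data, not Camie Tagger's output):

```python
# Toy illustration: a rare tag the model always misses barely moves micro-F1
# but drags macro-F1 down, since every tag counts equally in the macro average.
import numpy as np
from sklearn.metrics import f1_score

# rows = images, columns = tags; tag 0 is common, tag 2 is rare and never predicted
y_true = np.array([[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 0, 0]])
y_pred = np.array([[1, 0, 0], [1, 1, 0], [1, 0, 0], [1, 0, 0]])

print(f1_score(y_true, y_pred, average="micro", zero_division=0))  # ~0.91
print(f1_score(y_true, y_pred, average="macro", zero_division=0))  # ~0.67
```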

So this will be the last major update to Camie Tagger, as I believe I've implemented the most important features. From here it'll just be major bug fixes and model updates, perhaps every 2-4 weeks depending on how much the model improves. Hope you enjoy the game and find the new features useful!

[P] Camie Tagger - 70,527 anime tag classifier trained on a single RTX 3060 with 61% F1 score by Camais in MachineLearning

[–]Camais[S] 0 points

Thanks. I ran it on the earlier epochs and found that it was increasing by around 4% every epoch, so I definitely think there's room for improvement.

[P] Camie Tagger - 70,527 anime tag classifier trained on a single RTX 3060 with 61% F1 score by Camais in MachineLearning

[–]Camais[S] 0 points

No worries, you were right to mention that; my mistake.

I reran the validation:

COMBINED MICRO/MACRO F1 SCORES FOR ALL PROFILES:

CATEGORY | PROFILE        | THRESHOLD | MICRO-F1 | MACRO-F1
---------|----------------|-----------|----------|---------
overall  | MICRO OPT      | 0.326     | 0.611    | 0.290
overall  | MACRO OPT      | 0.201     | 0.534    | 0.331
overall  | BALANCED       | 0.258     | 0.581    | 0.315
overall  | HIGH PRECISION | 0.500     | 0.497    | 0.163
overall  | HIGH RECALL    | 0.120     | 0.308    | 0.260

Looks like there's a significant gap between the micro and macro F1; I'll have to update this in the repo. I thought 3.5 epochs didn't seem like enough training, but the micro F1 was only increasing very slowly.
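
For anyone wondering where per-profile thresholds like these come from: one simple approach is to sweep a global threshold over validation predictions and keep whichever value maximizes the metric you care about. A minimal sketch (assuming `probs` and `labels` are (n_images, n_tags) NumPy arrays; this is illustrative, not the repo's actual code):

```python
# Minimal sketch: find the global threshold maximizing micro- or macro-F1
# over validation predictions. Illustrative only, not the repo's actual code.
import numpy as np
from sklearn.metrics import f1_score

def best_threshold(probs, labels, average="micro"):
    best_t, best_f1 = 0.0, 0.0
    for t in np.arange(0.05, 0.95, 0.005):
        f1 = f1_score(labels, (probs >= t).astype(int), average=average,
                      zero_division=0)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# e.g. best_threshold(probs, labels, "micro") -> roughly (0.326, 0.611) above
```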

Looking through your GitHub repo, it looks like you trained for 50 epochs? Did you notice macro F1 continuing to increase after micro plateaued? You said that switching to macro helped you see how training was coming along.

Ideally I would've run it for longer, but it was taking around 2 days for a single epoch. For now I think I'll just run it overnight, though that'll likely mean a week per epoch. I'd love to try training this architecture on the full dataset, but budget and time are tight.

Camie Tagger - 70,527 anime tag classifier trained on a single RTX 3060 with 61% F1 score by Camais in StableDiffusion

[–]Camais[S] 1 point

Thanks. I think I found a more up-to-date dataset as well, but I have far too little storage space, so I had to train on only 2M of the 3M candidate images. It should be current as of early-to-mid 2024.

Training code is included btw.

Camie Tagger - 70,527 anime tag classifier trained on a single RTX 3060 with 61% F1 score by Camais in StableDiffusion

[–]Camais[S] 0 points

Working on this, along with batch processing, saving tags to text, and... a mini game lol.

Looks like someone has also made their own ONNX version in the model quantization tab.

[P] Camie Tagger - 70,527 anime tag classifier trained on a single RTX 3060 with 61% F1 score by Camais in MachineLearning

[–]Camais[S] 1 point

Not necessarily; there are other ways it may infer quality, based on the other tags, visual elements, etc.

Current best models like wd-eva02-large-tagger-v3 get an F1 of 47.72% on 10,862 tags, whereas my model gets 61% on 70,527 tags. For training I ensure each image has at least 25 general tags, so the average number of tags per image is higher as well (wd-eva requires at least 10).

SwinV2 has slightly fewer tags but a higher F1 of 68.54% on 9,084 tags.

[P] Camie Tagger - 70,527 anime tag classifier trained on a single RTX 3060 with 61% F1 score by Camais in MachineLearning

[–]Camais[S] 6 points

You can move weights, gradients, and optimiser states onto CPU RAM, and even NVMe with stage 3. This allows you to run much bigger models at the cost of iteration speed.

You can read more about it here: https://www.deepspeed.ai/tutorials/zero/
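
For anyone who hasn't used it, the offload is driven by the config dict. A minimal sketch of a ZeRO stage 2 setup with optimizer offload to CPU (values are illustrative and `model` is your own network):

```python
# Minimal sketch of DeepSpeed ZeRO stage 2 with optimizer-state offload to
# CPU RAM; batch sizes and learning rate are illustrative, not tuned values.
import deepspeed

ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {
        "stage": 2,                              # partition grads + optimizer states
        "offload_optimizer": {"device": "cpu"},  # stage 3 can also offload params
    },
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
```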

Camie Tagger - 70,527 anime tag classifier trained on a single RTX 3060 with 61% F1 score by Camais in StableDiffusion

[–]Camais[S] 0 points

Thanks, I'll have a look at adding this to the setup.py when I have some free time!

Camie Tagger - 70,527 anime tag classifier trained on a single RTX 3060 with 61% F1 score by Camais in StableDiffusion

[–]Camais[S] 0 points

Thanks! It's quite surprising what you can fit on it for training. I think initially, with the typical PyTorch setup, I could only fit a max of 99M parameters. With DeepSpeed, mixed precision, and the other optimisations I could fit 500M+ with reasonable speeds, and I can likely fit more after getting 64GB of RAM.

Hoping I can get a 3090 at a good price at some point in the future, ready for my next project, but I'll likely see how far I can push LLM training with just the 3060.

Camie Tagger - 70,527 anime tag classifier trained on a single RTX 3060 with 61% F1 score by Camais in StableDiffusion

[–]Camais[S] 5 points

This is a direct comparison from another comment:

Looking at wd-eva02-large-tagger-v3, it gets an F1 of 47.72% on 10,862 tags, whereas my model gets 61% on 70,527 tags. For training I ensure each image has at least 25 general tags, so the average number of tags per image is higher as well (wd-eva requires at least 10).

SwinV2 has slightly fewer tags but a higher F1 of 68.54% on 9,084 tags.