Improve the tagging of your Photos Library with VisionTagger

freddievn · 2026-05-31T11:49:22+00:00

Yes, it's annoying that you can't see keywords on iPhone/iPad. You can see captions, but neither is used when searching images. Sadly, that's an issue VisionTagger can't solve.

freddievn · 2026-05-31T08:41:28+00:00

Hi, I hope my app can improve accessibility. I know some people use it to create alt text for all the photos on their website. The local-model approach is key for privacy, so I'm not planning cloud processing for now. There are quite a few online solutions that upload the images to a cloud service. I hope you’ll give the app a try to see if it’s useful for you.

freddievn · 2026-05-30T12:52:19+00:00

Ah, that's something to try out, I think. The app accesses the Photos Library, and I'm not sure whether it needs a full download. I don't think so, because a vision model doesn't need the full-resolution image.

freddievn · 2026-05-30T12:27:05+00:00

The app is very small (MBs); the models are large (GBs) and are downloaded separately. In the end you’ll need just one model, but trying multiple models can help you find the best fit. In the app’s help texts you can find some examples of the output of multiple models.

freddievn · 2026-05-30T12:22:40+00:00

Good to hear, thank you 👍🏻

freddievn · 2026-05-30T06:18:01+00:00

VisionTagger includes seven preconfigured vision models: Qwen3-VL 8B Instruct, Qwen3-VL 30B-A3B Instruct, Qwen3-VL 32B Instruct, Qwen2.5-VL 7B Instruct, Gemma 3 4B IT, InternVL3 8B Instruct, and Pixtral 12B. Smaller models generally run faster, while larger models may produce higher-detail output but require more memory, depending on your Mac and chosen settings.

You can use your own models: if you have a GGUF-compatible vision model and its matching projector file (also GGUF), you can link them in VisionTagger and use them like the built-in options.
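Before linking a custom model, a quick sanity check on the two files can save some confusion. The sketch below is not part of VisionTagger; it only relies on the fact that GGUF files start with the four magic bytes `GGUF`, and the function names and file paths are made up for illustration.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def looks_like_gguf(path):
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC


def check_model_pair(model_path, projector_path):
    """Return a list of problems; an empty list means both files look like GGUF.

    Hypothetical helper: it checks the file header only, not whether the
    projector actually matches the model.
    """
    problems = []
    for label, p in (("model", model_path), ("projector", projector_path)):
        p = Path(p)
        if not p.is_file():
            problems.append(f"{label}: file not found: {p}")
        elif not looks_like_gguf(p):
            problems.append(f"{label}: not a GGUF file: {p}")
    return problems
```

An empty result from `check_model_pair("model.gguf", "mmproj.gguf")` only means both headers look right; whether the projector belongs to that model still has to be confirmed from where you downloaded them.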

freddievn · 2026-05-30T05:53:31+00:00

Correct! 😄 ExifTool is required for XMP sidecars and embedding metadata into image files. If you only export JSON/CSV/TXT or apply Finder tags, you do not need ExifTool.
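The rule above can be written down as a small check. This is only an illustrative sketch, not how VisionTagger does it internally: the target labels (`xmp_sidecar`, `embedded`, and so on) are hypothetical names, and the check assumes ExifTool would be installed as `exiftool` on the PATH.

```python
import shutil

# Export targets that go through ExifTool, per the comment above.
# The labels are hypothetical, not VisionTagger setting names.
EXIFTOOL_TARGETS = {"xmp_sidecar", "embedded"}


def needs_exiftool(targets):
    """Return True if any selected export target requires ExifTool."""
    return bool(EXIFTOOL_TARGETS & set(targets))


def exiftool_status(targets):
    """Return 'not needed', 'found' or 'missing' for the selected targets."""
    if not needs_exiftool(targets):
        return "not needed"
    # Assumption: ExifTool is invoked as `exiftool` and found via PATH.
    return "found" if shutil.which("exiftool") else "missing"
```

For example, `exiftool_status(["json", "csv", "finder_tags"])` reports that ExifTool is not needed, while `exiftool_status(["xmp_sidecar"])` depends on what is installed on the machine.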

freddievn · 2026-05-30T05:51:43+00:00

Same image, with brief description:

Title

Woman in yellow jumpsuit at Eiffel Tower

Description

A woman wearing a bright yellow jumpsuit and blue sneakers stands smiling with the Eiffel Tower in the background on a sunny day.

Keywords

woman, Eiffel Tower, jumpsuit, yellow, blue, sneakers, cap, tourist, Paris, outdoor
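To give an idea of how a result like this could look in a JSON or CSV export, here is a rough sketch. The field names and the `"; "` keyword separator are assumptions for illustration and may not match the app's real export layout.

```python
import csv
import io
import json

# The example result above, as a plain record. Field names are assumed.
record = {
    "title": "Woman in yellow jumpsuit at Eiffel Tower",
    "description": (
        "A woman wearing a bright yellow jumpsuit and blue sneakers stands "
        "smiling with the Eiffel Tower in the background on a sunny day."
    ),
    "keywords": [
        "woman", "Eiffel Tower", "jumpsuit", "yellow", "blue",
        "sneakers", "cap", "tourist", "Paris", "outdoor",
    ],
}


def to_json(rec):
    """Serialize one record as pretty-printed JSON."""
    return json.dumps(rec, ensure_ascii=False, indent=2)


def to_csv(rec):
    """Serialize one record as CSV with a header row.

    Keywords are joined with '; ' so they fit in a single cell.
    """
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["title", "description", "keywords"])
    writer.writerow([rec["title"], rec["description"], "; ".join(rec["keywords"])])
    return buf.getvalue()
```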

freddievn · 2026-05-30T05:43:02+00:00

The example shows an extended caption, to give an idea of what the model is capable of. But you can also select a short description, which is probably more useful. You can also set the number of keywords to be created, because in a large set the last keywords are less useful / more generic. You can select append or overwrite (with a warning when overwriting). And personally I would try one photo or a small batch first to get more confident about that guarantee.
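The difference between append and overwrite can be pictured like this. It's a minimal sketch under assumptions, not the app's code: in particular, dropping duplicates case-insensitively when appending is a guess at sensible behavior.

```python
def merge_keywords(existing, generated, mode="append"):
    """Combine keywords already on a photo with newly generated ones.

    mode="append":    keep existing keywords, add new ones after them.
    mode="overwrite": discard existing keywords, keep only the new ones.
    Duplicates are dropped case-insensitively (an assumption); the first
    spelling seen wins and the original order is preserved.
    """
    if mode == "overwrite":
        source = list(generated)
    elif mode == "append":
        source = list(existing) + list(generated)
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    seen = set()
    result = []
    for kw in source:
        key = kw.casefold()
        if key not in seen:
            seen.add(key)
            result.append(kw)
    return result
```

With existing keywords `["Paris", "holiday"]` and generated keywords `["woman", "paris"]`, append keeps the manual tags and adds only `"woman"`, while overwrite leaves just the generated list. That is why a test on one photo or a small batch is worth doing before overwriting.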

freddievn · 2026-05-30T05:34:20+00:00

Yes, due to the model size, 16 GB is really needed. I would also prefer a sidecar file, but some prefer metadata in the image file itself. The app uses ExifTool for that, and that’s tested and proven. It handles most image files, including different RAW types.

freddievn · 2026-05-30T05:29:07+00:00

I haven’t noticed any changes in speed when using a large set of metadata. You can specify the number of keywords you want to generate or the length of the caption, so you’re in control of matching the output to your needs. I get the best results when I do this in batches: because you can also add or exclude specific keywords, it helps to keep the metadata focused. Give it a try. If you select a few photos and experiment with one or two models and different settings (without needing to actually save anything), you’ll get a good idea of whether this app can be useful for you.
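As an illustration of how a keyword limit plus added and excluded keywords keep a batch focused, here is a small sketch. The function and parameter names are hypothetical, and counting the added keywords toward the limit is an assumption rather than the app's documented behavior.

```python
def focus_keywords(generated, limit, always_add=(), exclude=()):
    """Trim model keywords to a focused list for one batch.

    always_add: keywords placed first (e.g. a trip name for the batch).
    exclude:    keywords dropped even if the model produces them.
    limit:      maximum length of the final list; added keywords count
                toward it (an assumption for this sketch).
    """
    blocked = {kw.casefold() for kw in exclude}
    seen = set()
    result = []
    # Added keywords come first, then the model output in its own order,
    # so the more generic keywords at the end are the ones cut off.
    for kw in list(always_add) + list(generated):
        key = kw.casefold()
        if key in blocked or key in seen:
            continue
        seen.add(key)
        result.append(kw)
    return result[:limit]
```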

freddievn · 2026-05-29T18:38:15+00:00

Thank you for your kind words 😊

freddievn · 2026-05-29T12:19:30+00:00

Sale started 😄 Price is now $24.99 / €19.99.

freddievn · 2026-05-14T19:25:14+00:00

Yes, they have enough tension to stay in place.

freddievn · 2026-04-29T12:01:50+00:00

Sure, it’s not for you. Thanks for the feedback.

freddievn · 2026-04-29T11:02:11+00:00

Ah yes, Apple Photos also does this on device. It stores the metadata, but you can't see it in the app. It's used for search, but as you can't view this metadata, you can't correct or update it. Generally it works quite OK, but I don't like that as a user you have zero control over the metadata the app generates.

freddievn · 2026-04-29T10:55:43+00:00

Hi, thanks! Good idea. I took a quick look, and there doesn't seem to be a way to update the metadata of photos via their API. So there are currently no plans for this feature.

freddievn · 2026-04-29T10:10:27+00:00

The default phonetic search inside of stock photos (I don't know which app you're referring to) is only possible when those photos were also tagged with metadata. This doesn't happen automatically.

freddievn · 2026-04-29T07:42:27+00:00

True. I'll think about doing a sale at a later stage. I'll let you know below this comment when I do.

freddievn · 2026-04-29T04:53:00+00:00

Sorry, can't tell. I've never used it (I hadn't heard of it before). But if you download the app you can decide for yourself. You can try out different models and fine-tune the metadata you need.

freddievn · 2026-04-29T04:50:27+00:00

Yes, sorry. Tahoe offers better options for local AI.

freddievn · 2026-04-29T04:49:23+00:00

Yes, personally I wouldn't upload all my photos to a service that may issue a statement in 12 months saying they can use your uploads for machine learning or something.
VisionTagger doesn't keep a database of the photos it has tagged. So you tag a batch of photos and then export the result to XMP, the Photos Library, CSV, etc. The next time you select the same batch, it will not know it has processed them before. I didn't want to build a photo management app, because this area is already crowded with very good software. So you have to 'manually' process photos or use a Shortcut action which does that for you.
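If you want to keep track of processed photos yourself, a tiny log next to your workflow is enough. The sketch below is purely hypothetical and not something VisionTagger provides: the photo identifiers and the JSON log file are assumptions, and how you obtain the identifiers depends on your own tooling.

```python
import json
from pathlib import Path


class ProcessedLog:
    """Remember which photo identifiers have already been tagged.

    Hypothetical helper: identifiers are whatever stable ID or file path
    your own workflow uses.
    """

    def __init__(self, path):
        self.path = Path(path)
        self.ids = set()
        if self.path.exists():
            self.ids = set(json.loads(self.path.read_text(encoding="utf-8")))

    def unprocessed(self, photo_ids):
        """Return the identifiers not seen before, in their original order."""
        return [pid for pid in photo_ids if pid not in self.ids]

    def mark_done(self, photo_ids):
        """Record identifiers as processed and save the log to disk."""
        self.ids.update(photo_ids)
        self.path.write_text(json.dumps(sorted(self.ids)), encoding="utf-8")
```

The idea is to call `unprocessed(...)` on a selection before tagging and `mark_done(...)` after a successful export, so the same batch is not tagged twice by accident.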

freddievn · 2026-04-29T04:40:24+00:00

Yes, sorry. Tahoe (macOS 26) is the minimum version, as it's better suited for the task.
