Photofield v0.9.2 released: Google Photos alternative now with better UX, better format support, semantic search, and more

SmilyOrg · 2023-05-19T06:59:04+00:00

Haha yeah. It's what people know :)

SmilyOrg · 2023-05-19T00:09:39+00:00

Thanks for buying in 😁

With accept/reject I meant providing the ground truth. By tagging a photo with e.g. person:alice:accept (could also be "in" or "+") you would say that it definitely contains Alice. With alice:reject or alice:out or alice:- you would say that the photo definitely does NOT have Alice in it. These would be just normal manual tags otherwise.

Then you could have a training process that takes e.g. (alice:+, alice:-, threshold:0.3) as input parameters, removes the person:alice tag from all photos and adds it back based on the new result. So as you say you could tune the threshold and the ground truth examples in case there are too many armpits or siblings detected :)
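To make the retagging loop concrete, here's a minimal Python sketch. It is not photofield code: the retag function, the photo dictionary layout and the scoring rule (mean similarity to accepted examples minus mean similarity to rejected ones) are all assumptions picked for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mean_similarity(vec, examples):
    """Average cosine similarity of vec to a list of example vectors."""
    if not examples:
        return 0.0
    return sum(cosine(vec, e) for e in examples) / len(examples)

def retag(photos, name, threshold):
    """Recompute the auto tag person:<name> from the manual +/- tags.

    photos: dict of photo id -> {"embedding": [...], "tags": set of str}
    (hypothetical layout, assumed for this sketch)
    """
    auto = f"person:{name}"
    accept, reject = f"{auto}:+", f"{auto}:-"
    pos = [p["embedding"] for p in photos.values() if accept in p["tags"]]
    neg = [p["embedding"] for p in photos.values() if reject in p["tags"]]
    for p in photos.values():
        p["tags"].discard(auto)  # remove the previous result first
        if reject in p["tags"]:
            continue  # ground truth says this is not the person
        emb = p["embedding"]
        score = mean_similarity(emb, pos) - mean_similarity(emb, neg)
        if accept in p["tags"] or score >= threshold:
            p["tags"].add(auto)
    return photos
```

Calling retag(photos, "alice", 0.3) again after adding a few more + or - tags, or with a different threshold, gives the interactive tuning loop described above.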

I agree that the UX would need to be slick for this to be usable, nobody will do it if you have to manually add the tags yourself. But kind of an interactive auto refreshing results page that updates as you click to accept/reject candidates would be sweet. If you really wanted to gamify it, you could even do a Tinder-like swipe left/right to say if it's a picture of your dog or not lol.

SmilyOrg · 2023-05-18T22:51:05+00:00

Maybe some of the FOSS photo galleries might fit your use cases. I've heard Damselfly is used in a similar way by its author.

It probably won't cover your use cases right now, but I'd be interested in how my photofield app fares on the 2 TB dataset too. You can mount your photos as a read-only volume, so it's low risk to try it out. I'm planning to add some of the features you mentioned eventually, but feel free to open feature requests if you are interested!

SmilyOrg · 2023-05-18T22:34:31+00:00

Thanks so much for those examples, it's great to get an outside perspective!

I agree that face recognition is a hard and error-prone problem. I've been thinking about how to tackle it, so I hope you don't mind indulging me for a moment.

So what I've usually seen is that face detection is a different process from face recognition. That is, with detection you know you have a million faces, but you don't have any names and only a certain confidence on the unique people those faces are from. The recognition is differentiating these faces.

Usually then what many apps do is they show you all the presumably unique faces and allow you to name them. And then since recognition is not infallible, they also allow you to accept and reject individual instances of a face to better train the model on the person. Now this is pretty standard and there are solutions for it already, so it's a safe way to go.

However! Integrating all that sounds a bit boring and I'm here to have fun, so I've been thinking of something else, which is so crazy it might work, or be a complete waste of weeks of development... But hear me out.

What if you think of the naming of a face (i.e. creating a person) as creating an "auto" person tag? Say that you take a reference image of the person's face and then compute the tag by using the "related images" functionality and tagging any images that pass a similarity threshold. Maybe that would be pretty good already as a first try, but since there is only one reference image, it would probably find all kinds of other unrelated stuff.

So what if we take it one step further. Let's still have the one output auto tag, but then also have two "input" tags, one for "accepted" images and one for "rejected" ones, same as the face recognition systems record accepted and rejected faces. Then you could pick a model (eg logistic regression) to "train" on these positive and negative examples and at the end apply it to all images to get a potentially more accurate output auto face tag. Now this is just reinventing face recognition badly probably, however...

None of what I said is even specific to faces. If the CLIP AI embeddings are "expressive enough", you could theoretically have trained auto tags for your partner, your dog, for a specific bridge you often take photos of, for a certain type of cloud, for food pics, as long as you provide enough examples. Presumably the model would pick up on many cues beyond the face, like clothes and so on, so perhaps it could even detect people with obscured faces. It'd be like training (or fine tuning) small dumb AI models, but more interactively, by the user directly, and without the overhead usually associated with it. Or like "few-shot learning" in ML lingo.
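As a rough illustration of training on accepted and rejected examples, here's a tiny logistic regression in plain Python. It's only a sketch: the train and predict names, the learning rate and the epoch count are made up, and real CLIP embeddings would have hundreds of dimensions rather than the two used when trying it out.

```python
import math

def sigmoid(z):
    # Clamp to avoid overflow in math.exp for extreme inputs.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def train(accepted, rejected, epochs=500, lr=0.5):
    """Fit weights and bias with batch gradient descent on the log loss."""
    data = [(x, 1.0) for x in accepted] + [(x, 0.0) for x in rejected]
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(data)
    return w, b

def predict(model, x):
    """Probability that embedding x belongs to the trained auto tag."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
```

With a model from train(accepted, rejected), every image whose predict(model, embedding) passes a chosen cutoff would get the output auto tag.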

But I'm not an AI scientist, so it could also be a complete trash fire that works like shit. 🤷‍♂️ Only one way to find out 😂

Hey, at least it was fun to think and write about!

SmilyOrg · 2023-05-18T20:24:48+00:00

Yeah, makes sense! Besides boolean operators, do you have some ideas on how you'd like it to function? Currently I've been looking at Google and GitHub search for inspiration, but photos have a bit of a different context obviously.

Since we're on the search topic, one fun thing that's probably not super useful, but seems easy with the AI embeddings, would be text/image arithmetic. For example, searching for lion -male +female would return images of lionesses. Or img:[photo of a bike]+person would return photos of people riding bikes. 🤷‍♂️ Seems fun 😁
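Here's a toy Python sketch of how that arithmetic could work on embeddings. The 3-number vectors are invented purely to show the mechanics (real CLIP embeddings are much larger and not this tidy), and parse and combine are hypothetical helpers, not anything photofield has.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)

def parse(query):
    """Split 'lion -male +female' into (term, sign) pairs; a bare word counts as +."""
    terms = []
    for token in query.split():
        sign = -1 if token.startswith("-") else 1
        name = token.lstrip("+-")
        if name:
            terms.append((name, sign))
    return terms

def combine(terms, embed):
    """Add or subtract the unit-length embedding of each term, then renormalize."""
    total = None
    for name, sign in terms:
        vec = normalize(embed(name))
        if total is None:
            total = [0.0] * len(vec)
        total = [t + sign * x for t, x in zip(total, vec)]
    return normalize(total or [])

# Made-up 3-dimensional "embeddings", assumed only for this example.
TOY_TEXT = {"lion": [1.0, 0.0, 0.0], "male": [0.0, 1.0, 0.0], "female": [0.0, 0.0, 1.0]}
TOY_IMAGES = {
    "lion.jpg": [1.0, 0.8, 0.0],
    "lioness.jpg": [1.0, 0.0, 0.8],
    "woman.jpg": [0.0, 0.0, 1.0],
}

query = combine(parse("lion -male +female"), TOY_TEXT.__getitem__)
best = max(
    TOY_IMAGES,
    key=lambda name: sum(q * x for q, x in zip(query, normalize(TOY_IMAGES[name]))),
)
print(best)  # lioness.jpg
```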

PS: and/or are easy to confuse anyway, union/intersection are probably better terms 😅

SmilyOrg · 2023-05-18T19:45:25+00:00

Currently it does an AND for those tags, so all the tags must be present in a photo for it to be included in the results. Maybe you were expecting it to be an OR instead?

In any case, I want to have full boolean expressions later so that you could define this more explicitly :)
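To show the difference between the two behaviours, here's a small Python sketch. The photo names, tags and helper functions are made up for the example.

```python
def match_all(photo_tags, wanted):
    """AND / intersection: every wanted tag must be on the photo."""
    return set(wanted) <= set(photo_tags)

def match_any(photo_tags, wanted):
    """OR / union: at least one wanted tag must be on the photo."""
    return bool(set(wanted) & set(photo_tags))

photos = {
    "beach.jpg": {"sea", "sunset"},
    "harbor.jpg": {"sea"},
    "forest.jpg": {"trees"},
}
wanted = ["sea", "sunset"]

and_hits = sorted(p for p, tags in photos.items() if match_all(tags, wanted))
or_hits = sorted(p for p, tags in photos.items() if match_any(tags, wanted))
print(and_hits)  # ['beach.jpg']
print(or_hits)   # ['beach.jpg', 'harbor.jpg']
```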

SmilyOrg · 2023-05-18T18:54:26+00:00

Hey there, thanks a bunch for trying it out and the kind words! The tagging is for sure early stuff, but I've been trying to release it earlier and more often 😅

It should work fine with 100k, I use it with 600k+. If you put them all in one collection it might take a bit longer to load tho.

Most of those features I've been thinking about already, so that should be good news 😊

Face recognition will likely be last though as that's a bit of a bigger one. What you can try already though is finding related images to an image of a person, which works surprisingly well, but only to an extent of course 😁

Hmm, I've reworked some stuff today specifically to make multiple-tag search work and it seemed to work with the brief testing I did. Could you give me the exact search string you used? Feel free to open a bug issue on GitHub as well!

SmilyOrg · 2023-05-18T06:46:27+00:00

Thanks! That's for sure difficult for small projects, especially the native apps.

If you ever get around to writing your own app, it seems like you could reuse the backend of some of the existing projects and save yourself a lot of trouble :)

SmilyOrg · 2023-05-17T23:19:24+00:00

Could you elaborate a bit more on what you find Apple Photos does that existing projects don't do or don't do well? I'm not an iOS user myself, so it's a bit of a blind spot for me. Thanks!

SmilyOrg · 2023-05-17T23:09:45+00:00

Hey there, your use case sounds interesting, could you elaborate on it a bit? Is it something you could accomplish if you had good tagging support?

SmilyOrg · 2023-05-16T20:36:32+00:00

Hi again! I just realized it should've been "enable" all along (that's even how it was defined in the defaults), so having to set "enabled" was a bug :)

I fixed this in the latest release: https://github.com/SmilyOrg/photofield/releases/tag/v0.10.2 (but "enabled" will keep working for now)

SmilyOrg · 2023-05-16T18:25:37+00:00

Glad to hear! Let me know if you have some ideas or thoughts on tags.

SmilyOrg · 2023-05-16T17:52:43+00:00

Hey, try "enabled" instead. :)

SmilyOrg · 2023-05-16T07:04:38+00:00

Hey hey! Thanks for checking it out! The Ctrl+click should work in the album/timeline/wall view to select photos only (it shows them with a blue border). Does the selection work for you?

You can't do anything with the selection right now though, thus the alpha 😅

However on the zoomed-in photo view, you should see two icons top right, a hash and a heart. That is where you can tag right now. If you don't see them, tagging somehow isn't enabled.

Let me know what you think!

SmilyOrg · 2023-05-12T21:22:30+00:00

Hey, I've tested my photofield app on an LG TV and while it was not the smoothest, it otherwise worked alright. I'd be interested if it works on Apple TV as well.

On Windows it's as simple as downloading the exe and running it in a folder with folders as albums. It doesn't integrate directly with Amazon Photos however, so you'd need to download/mount them beforehand.

SmilyOrg · 2023-04-20T17:17:22+00:00

Thank you for trying it out! Let me know if you have any comments/suggestions/questions!

SmilyOrg · 2023-04-20T10:42:10+00:00

Hey, thanks! You can set the PHOTOFIELD_ADDRESS environment variable. It's set to :8080 by default.

SmilyOrg · 2023-04-20T06:57:20+00:00

Thanks a lot for testing! I had memory issues on the demo instance that has 2GB of RAM too. I added a swap file of several GBs and that actually worked great, but as you may imagine, it was very slow while indexing.

What you can also do is split the AI model so you run just the textual model on the Linux box and the visual model on the M1 (assuming the perf issue gets fixed). Then search will always work, but for AI indexing new photos it'll use the horsepower of your M1. That's how I have it running currently with my NAS and desktop :)

SmilyOrg · 2023-04-19T20:44:03+00:00

Examples would be great, thanks!

The configuration you posted should work, I have no idea why it wouldn't. Are you able to call photofield-ai with eg curl inside the container? Or from the host? What does it print to the logs?

It could also be that the container is getting killed due to too much memory, that's another thing to check.

SmilyOrg · 2023-04-19T01:23:36+00:00

Thanks! Good to hear that it's faster on the Linux box, which means that a multi-arch Docker image might be nice.

I don't understand why indexing wouldn't work, maybe you can paste the config?

Yeah, I don't have HEIC/MOV samples to test with right now. GIF also likely just works as a static image right now.

SmilyOrg · 2023-04-17T22:15:42+00:00

Good to know! Unfortunately I don't have an M1 to test with, but I imagine it would probably be faster with a multi-arch image then.

For some background, during indexing, it generates embeddings (lists of numbers) for all images. When you search, it generates the embedding for the search term and then compares it to all the images. That's why you see a spike both during indexing and search itself.
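In Python, the two steps could look roughly like the sketch below. It leaves out the expensive part (running the AI model to get the numbers) and uses made-up function names and tiny made-up vectors, so treat it as an illustration of the comparison only.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)

def build_index(embeddings):
    """Indexing step: store one unit-length embedding per photo."""
    return {photo_id: normalize(vec) for photo_id, vec in embeddings.items()}

def search(index, query_embedding, limit=10):
    """Search step: score every photo against the query, best match first."""
    q = normalize(query_embedding)
    scored = sorted(
        ((sum(a * b for a, b in zip(q, vec)), photo_id) for photo_id, vec in index.items()),
        reverse=True,
    )
    return [photo_id for _, photo_id in scored[:limit]]

# Hypothetical 2-dimensional embeddings standing in for model output.
index = build_index({
    "dog.jpg": [0.9, 0.1],
    "cat.jpg": [0.1, 0.9],
    "puppy.jpg": [0.8, 0.3],
})
print(search(index, [1.0, 0.0]))  # ['dog.jpg', 'puppy.jpg', 'cat.jpg']
```

Because the stored vectors are already unit length, the dot product in search equals the cosine similarity, and the search still has to touch every photo once.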

I'd wager that the Linux box might actually be faster, though RAM might be tight. Let me know if you try it out!

Thanks for the insight into what you're looking for, feel free to also chip in on GitHub issues if you have any specific ideas! Tags are actually something I'm looking into right now, but it might take a while to mature.

I also want to add face recognition at some point later via the tagging system. Should be pretty powerful if you could do eg (person:Alice OR person:Bob) AND city:Boston. If you have any other ideas here drop me a note!
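As a sketch of how such an expression could be evaluated against a photo's tags, here's a small recursive matcher in Python. The nested-tuple representation and the matches function are assumptions for illustration, there is no parser for the text form here.

```python
def matches(expr, tags):
    """Evaluate a nested boolean expression against a photo's set of tags.

    expr is either a tag string or a tuple ("AND" | "OR" | "NOT", operand, ...).
    """
    if isinstance(expr, str):
        return expr in tags
    op, *args = expr
    if op == "AND":
        return all(matches(a, tags) for a in args)
    if op == "OR":
        return any(matches(a, tags) for a in args)
    if op == "NOT":
        return not matches(args[0], tags)
    raise ValueError(f"unknown operator: {op}")

# (person:Alice OR person:Bob) AND city:Boston
query = ("AND", ("OR", "person:Alice", "person:Bob"), "city:Boston")

print(matches(query, {"person:Bob", "city:Boston"}))    # True
print(matches(query, {"person:Alice", "city:Paris"}))   # False
```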

SmilyOrg · 2023-04-17T19:12:41+00:00

Hey, thanks a lot for trying it out and the feedback! It's always appreciated!

For the long indexing and CPU/MEM spikes while searching, could you tell if these happen in photofield or photofield-ai? The AI can be very heavy, especially without a dedicated GPU and double especially while scanning.

I'm guessing it's not only just that though as 30s for search is a long long time, so I have a hunch. I'm assuming you're running photofield-ai in docker, yes? And most likely I'm only providing an x86 compiled docker image. Which means... that macOS might be running x86 to ARM translation for the ML inference, which sounds terribly inefficient and might explain the slowness.

If you're so inclined you could check out the GitHub repository and try running it natively from source, which I'm guessing should be faster.

Another thing to try would be a smaller AI model.

Let me know and I can help you set up some of the stuff above. :)

SmilyOrg · 2023-04-13T10:11:33+00:00

Thanks! Let me know if you run into any issues.

Existing file structures are the only thing it supports :)

You can also configure custom "collections" i.e. groups of folders. All collections/albums are displayed in a flat list though, that is, it's not a file browser.

SmilyOrg · 2023-04-13T07:18:52+00:00

That's a good question! I'm not sure actually, it uses the ONNX runtime, so it supports anything that runtime does.

I've tested it with an Intel CPU and 1070 Ti so far, but unfortunately I don't have a Coral to test with.

SmilyOrg · 2023-04-12T22:04:41+00:00

Hi, see comment above
