224 points jamesxv7 | 2 comments

First of all, this is purely a personal learning project for me, aiming to combine three of my passions: photography, software engineering, and my family memories. I have a large collection of family photos and want to build an interactive experience to explore them, à la Google Photos or Apple Photos.

My goal is to create a system with smart search capabilities, and one of the most important requirements is that it must run entirely on my local hardware. Privacy is key, but the main driver is the challenge and joy of building it myself (and, obviously, to learn).

The key features I'm aiming for are:

Automatic identification and tagging of family members (local face recognition).

Generation of descriptive captions for each photo.

Natural language search (e.g., "Show me photos of us at the beach in Luquillo from last summer").

I've already prompted AI tools for a high-level project plan, and they provided a solid blueprint (e.g., Ollama with LLaVA, a vector DB like ChromaDB, you know the drill). Now I'm most interested in the real-world human experience. I'm looking for advice, learning stories, and the little details that only come from building something similar.

What tools, models, and best practices would you recommend for a project like this in 2025? Specifically, I'm curious about combining structured metadata (EXIF), face recognition data, and semantic vector search into a single, cohesive application.
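One way to picture that combination is a filter-then-rank search: structured fields (EXIF date, recognized people) narrow the candidate set, and vector similarity orders whatever is left. The sketch below is only an illustration under stated assumptions. `Photo`, `cosine`, and `search` are hypothetical names, the embeddings are toy two-dimensional vectors, and a real setup would get them from a captioning or embedding model and most likely store them in a vector DB rather than a Python list.

```python
from dataclasses import dataclass, field
from datetime import datetime
from math import sqrt


@dataclass
class Photo:
    path: str
    taken_at: datetime  # assumed to be parsed from EXIF beforehand
    people: set[str] = field(default_factory=set)  # names from face recognition
    embedding: list[float] = field(default_factory=list)  # caption/image vector


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(photos, query_embedding, person=None, start=None, end=None, top_k=5):
    """Filter on structured metadata first, then rank survivors by similarity."""
    candidates = [
        p for p in photos
        if (person is None or person in p.people)
        and (start is None or p.taken_at >= start)
        and (end is None or p.taken_at < end)
    ]
    candidates.sort(key=lambda p: cosine(p.embedding, query_embedding), reverse=True)
    return candidates[:top_k]
```

A query like "photos of Ana at the beach last summer" would then turn into something like `search(photos, embed("beach"), person="Ana", start=..., end=...)`, where `embed` is whatever (hypothetical here) function produces the query vector. Filtering before ranking keeps date and person constraints exact instead of hoping the embedding captures them.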

Any and all advice would be deeply appreciated. Thanks!

1. kreyenborgi No.44426579
Have you tried all of these? How are they with very large photo collections?
replies(1): >>44426745
2. kevincox No.44426745
I've used PhotoPrism and Immich. Everyone's definition of "very large" is different, but I have about 100k photos and videos, which come to a bit over 1 TiB (original data, not thumbnails and previews). Neither had any performance issues, with a few minor exceptions on Immich (I don't recall anything from PhotoPrism, but it has been a while now since I switched):

1. The Immich app's performance is awful. It is a well-known problem and their current focus. I have pretty high confidence that it will be fixed within a few months. The web app is totally fine though.

2. Some background operations such as AI indexing, face detection and video conversion don't work gracefully when restarted from scratch. They all basically first delete all the old data, then start processing assets. So for many days (depending on your parallelism settings and server performance) some assets may be completely missing from search, or may lack converted videos. But you only need to do this very rarely (when you change encoding settings and want to apply them to the back catalog, or switch the AI search model). I don't upload at a particularly high rate, but my server can very easily handle the steady state.

Issue 1 is pretty major, but it is being worked on and you can work around it by just opening the website. Issue 2 is less important, but I don't think there is any work on it.
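For anyone building their own indexer, one common way to avoid the gap described in the second point is to build the replacement index off to the side and swap it in only once it is complete, so search keeps serving the old entries in the meantime. The sketch below is a hypothetical illustration, not how Immich is implemented; `SearchIndex`, `rebuild`, and the `embed` callback are made-up names, and a real system would persist the staging data rather than hold it in memory.

```python
from typing import Callable, Iterable, Optional


class SearchIndex:
    """Toy in-memory index that stays searchable while it is being rebuilt."""

    def __init__(self) -> None:
        self._live: dict[str, list[float]] = {}

    def lookup(self, asset_id: str) -> Optional[list[float]]:
        # Always reads the last fully built index.
        return self._live.get(asset_id)

    def rebuild(self, asset_ids: Iterable[str],
                embed: Callable[[str], list[float]]) -> None:
        # Build into a separate staging dict instead of clearing the live one.
        staging: dict[str, list[float]] = {}
        for asset_id in asset_ids:
            staging[asset_id] = embed(asset_id)  # slow step (model inference)
        # Swap only after every asset has been processed.
        self._live = staging
```

The cost is holding two copies of the index during a rebuild, which for embeddings is usually small next to the original photos.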