
706 points janpio | 3 comments
NelsonMinar ◴[] No.45676467[source]
Be sure to see the team's whole list of Cursed Knowledge. https://immich.app/cursed-knowledge
replies(5): >>45676661 #>>45677766 #>>45677816 #>>45678252 #>>45679167 #
levkk ◴[] No.45676661[source]
The Postgres query parameters one is funny. 65k parameters is not enough for you?!
replies(2): >>45676824 #>>45677258 #
strken ◴[] No.45676824[source]
As it says, bulk inserts with large datasets can fail. Inserting a few thousand rows into a table with 30 columns will hit the limit. You might run into this if you were synchronising data between systems or running big batch jobs.
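
To make the arithmetic concrete, here's a rough TypeScript sketch (node-postgres, made-up 30-column table) of the naive multi-row INSERT that trips the limit: 65,535 bind parameters divided by 30 columns is only about 2,184 rows.

    import { Client } from "pg";

    // Naive bulk insert: one bind parameter per cell. With 30 columns,
    // anything past ~2,184 rows exceeds Postgres's 65,535-parameter
    // wire-protocol limit and the query fails. Table name is made up.
    async function naiveBulkInsert(client: Client, rows: unknown[][]) {
      const params: unknown[] = [];
      const tuples = rows.map((row) =>
        "(" +
        row
          .map((value) => {
            params.push(value);
            return "$" + params.length; // $1 ... $66000 and beyond
          })
          .join(", ") +
        ")"
      );
      // e.g. 2,200 rows x 30 columns = 66,000 parameters -> fails
      await client.query(
        `INSERT INTO wide_table VALUES ${tuples.join(", ")}`,
        params
      );
    }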

SQLite used to have a default limit of 999 query parameters, which was much easier to hit. It's now a roomy 32,766.

replies(2): >>45676893 #>>45676985 #
1. tym0 ◴[] No.45676985[source]
Right, for Postgres I would use unnest for inserting a variable number of rows.
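
Something like this sketch (node-postgres, invented table and column names): each column becomes a single array parameter, so the parameter count stays at the number of columns no matter how many rows you insert.

    import { Client } from "pg";

    // UNNEST-based insert: one array parameter per column instead of one
    // per cell, so 3 parameters total regardless of row count.
    // Table and column names are illustrative.
    async function bulkInsertAssets(
      client: Client,
      ids: string[],
      paths: string[],
      sizes: number[]
    ) {
      await client.query(
        `INSERT INTO assets (id, path, size_bytes)
         SELECT * FROM unnest($1::uuid[], $2::text[], $3::bigint[])`,
        [ids, paths, sizes]
      );
    }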
replies(1): >>45677652 #
2. strken ◴[] No.45677652[source]
In the past I've used batches of data, inserted into a separate table with all the constraints turned off and using UNNEST, and then inserted into the final table once the load was done. We ended up both batching the data and using UNNEST because it was faster while still letting us resume midway through.
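
Roughly this shape, assuming a hypothetical unconstrained assets_staging table: each batch is a small UNNEST insert, and the final move is a single INSERT ... SELECT, so a crash mid-load only loses the current batch.

    import { Client } from "pg";

    // Two-phase load: append batches to an unconstrained staging table,
    // then move everything into the real table in one statement.
    // assets_staging / assets and their columns are invented names.
    async function loadBatch(client: Client, ids: string[], paths: string[]) {
      await client.query(
        `INSERT INTO assets_staging (id, path)
         SELECT * FROM unnest($1::uuid[], $2::text[])`,
        [ids, paths]
      );
    }

    async function moveToFinalTable(client: Client) {
      await client.query("BEGIN");
      await client.query(
        "INSERT INTO assets (id, path) SELECT id, path FROM assets_staging"
      );
      await client.query("TRUNCATE assets_staging");
      await client.query("COMMIT");
    }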

We probably should have been partitioning the data instead of inserting it twice, but I never got around to fixing that.

COPY is likely a better option if you have access to the host, or provider-specific extensions like aws_s3 if you have those. I'm sure a data engineer would be able to suggest a better ETL architecture than "shove everything into postgres", too.
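
Worth noting that COPY ... FROM STDIN streams rows over the normal connection, so it doesn't strictly need filesystem access on the database host. A hedged sketch with node-postgres and the pg-copy-streams helper (table and file names invented):

    import { createReadStream } from "node:fs";
    import { pipeline } from "node:stream/promises";
    import { Client } from "pg";
    import { from as copyFrom } from "pg-copy-streams";

    // Stream a local CSV into the server with COPY ... FROM STDIN.
    // Only a connection is needed, not a shell on the database host.
    async function copyCsv(client: Client, path: string) {
      const ingest = client.query(
        copyFrom(
          "COPY assets (id, path, size_bytes) FROM STDIN WITH (FORMAT csv)"
        )
      );
      await pipeline(createReadStream(path), ingest);
    }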

replies(1): >>45678736 #
3. devjab ◴[] No.45678736[source]
Was MERGE too slow/expensive? We tend to MERGE from staging or temporary tables when we sync big data sets. If we were on postgres I think we'd use ... ON CONFLICT, but MERGE does work.
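
For anyone following along: Postgres has supported MERGE since version 15, and the older idiom is INSERT ... ON CONFLICT. A rough sketch of the upsert-from-staging step (invented table and columns):

    import { Client } from "pg";

    // Upsert from a staging table with INSERT ... ON CONFLICT; on
    // Postgres 15+ a MERGE statement can express the same sync.
    // Table and column names are invented for the example.
    async function upsertFromStaging(client: Client) {
      await client.query(`
        INSERT INTO assets (id, path, size_bytes)
        SELECT id, path, size_bytes FROM assets_staging
        ON CONFLICT (id) DO UPDATE
          SET path = EXCLUDED.path,
              size_bytes = EXCLUDED.size_bytes
      `);
    }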