> What's the right solution? It's case by case, down to a mixture of morality and expertise to decide.
I think the idea of minimizing harm is a really good one.
I've never done any machine learning type stuff, but, based on my limited understanding, I think there are probably a few issues at play that make things difficult.
I think the feedback loop for an algorithm is likely important. If you're training an algorithm to match fingerprints, you have a few things that work in your favor. First, matching is easier with fewer samples, so you can train the model incrementally on larger and larger data sets. Second, the process of identifying false positives is easy, relatively definitive, and isn't influenced by external factors. If the ML algorithm only has X% confidence, you send it to a human who assesses the match and tells the algorithm the answer so it can "learn" for the next similar situation.
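To make that loop concrete, here's a rough sketch of what I imagine the routing looks like. Everything here is made up by me (the `model`, `human_review`, and `record_label` names, the 95% cutoff); it's just the shape of the idea, not anyone's actual system:

```python
# Rough sketch of a human-in-the-loop matching pipeline.
# All names and the threshold are hypothetical placeholders.

CONFIDENCE_THRESHOLD = 0.95  # the "X%" from above


def classify_with_review(model, sample, human_review, record_label):
    """Auto-accept confident matches; escalate uncertain ones to a person."""
    label, confidence = model.predict(sample)

    if confidence >= CONFIDENCE_THRESHOLD:
        # Confident enough to act on automatically.
        return label

    # Below the threshold: a human makes the call, and their answer
    # becomes a new training example, which keeps the feedback loop tight.
    verified_label = human_review(sample, suggested=label)
    record_label(sample, verified_label)  # feeds the next training run
    return verified_label
```

The key property is that every uncertain case produces a definitive, human-verified label the model can learn from.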
Contrast that with something like payment processing. First, you need to scale with demand and it's not easy to incrementally train the algorithm. Second, false positives don't have a tight feedback loop. A false positive negatively affects a customer and every case is different. You need to rely on external, subjective data that isn't definitive enough to be useful to an algorithm (IMO).
I think matching fingerprints is a good analogy to illustrate some of the problems, especially when you hear things along the lines of "looked too similar to fraudulent activity." With fingerprints, you could give 10 to an amateur and they could probably match them accurately. Scale that up to 10,000 and you have so many that look similar but not identical that you need a professional to do the matching.
I think ML is similar. It's better on a small scale than it is on a large scale and just doesn't scale up as well as the sales pitch says (unless it's assessing problems with definitive solutions). The issue here is that tech companies are treating ML like it scales in a linear fashion. Just throw more compute at it and 10x the scale, right? Wrong (IMO).
There was another comment here that said something along the lines of getting to 98% accuracy and deciding not to serve the other 2%. I think that's what's happening everywhere, but rather than explicitly telling customers they're not welcome, companies are simply letting their ML algorithms run to find the equilibrium where they can manage the "not positive" rate.
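To illustrate what I mean by letting the algorithm find that equilibrium, here's a toy sketch. All of it is made up (the scores, the 2% cap, the function name); it's just how I picture picking a cutoff that holds the flagged rate at whatever level the business has decided to eat:

```python
# Hypothetical sketch: choose the score cutoff that keeps the flagged
# ("not positive") rate at or below a rate the business will tolerate.

def pick_threshold(risk_scores, max_flag_rate=0.02):
    """Return the cutoff that flags at most max_flag_rate of customers."""
    ranked = sorted(risk_scores, reverse=True)   # riskiest first
    allowed = int(len(ranked) * max_flag_rate)   # e.g. 2% of the population
    if allowed == 0:
        return float("inf")                      # flag nobody
    return ranked[allowed - 1]                   # scores >= cutoff get flagged


# Toy example with evenly spread, made-up risk scores.
scores = [i / 1000 for i in range(1000)]
cutoff = pick_threshold(scores, max_flag_rate=0.02)
flagged = [s for s in scores if s >= cutoff]
print(f"cutoff={cutoff:.3f}, flagged {len(flagged)} of {len(scores)} customers")
```

Notice the question being answered isn't "is this person actually fraudulent?", it's "how many flagged customers can we afford?"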
And that goes back to your idea of minimizing harm. They don't want to. They don't care that they promised you service even though you're borderline in terms of triggering false positives. You're part of the data set for their machine learning algorithm, and that means you're viewed as acceptable collateral damage. They'll ruin your life to train their ML algorithm(s).