>Not only that, they also assume or pretend that this is obviously violating copyright, when in fact this is a) not clear, and b) pending determination by courts and legislators around the world.
Legislation always takes time to catch up with tech, that's not new.
The question I see being put forth from those with legal and IP backgrounds is about inputs vs. outputs, as in "if you didn't have access to X (which has some form of legal IP protection) as an input, would you be able to get the output of a working model?" The comparison here is to manufacturing, where you assemble parts made by others into some final product, and you would be buying those inputs to create your product output.
The cost of purchasing the required inputs is not being paid for AI, which pretty solidly puts models trained on copyrighted materials in hot water. The fact that it's an imperfect analogy and doesn't really capture the way software development works is irrelevant if the courts end up agreeing with something they can understand as a comparison.
All that being said, I don't think legality is a real consideration for any of the companies building models - the profit margins are too high to care for now, and catching them at it is potentially difficult.
There's also a tendency for AI advocates to argue that AI/LLMs are "special" in some way, comparing their development process to a person "learning" the style of art (or whatever the input is) and then internalizing it into their own style. Personally, I think that argument attributes a lot of agency to these models that they don't actually have, and weakens the overall legal case.