114 points dworks | 18 comments
1. tengbretson ◴[] No.44482169[source]
In the LLM intellectual property paradigm, I think this registers as a solid "Who cares?" level offence.
replies(7): >>44482174 #>>44482176 #>>44482191 #>>44482209 #>>44482275 #>>44482276 #>>44482505 #
2. brookst ◴[] No.44482176[source]
The point isn’t some moral outrage over IP; the point is that a company may be falsely claiming to have expertise it does not have, which is meaningful to people who care about the market in general.
replies(2): >>44482214 #>>44482447 #
3. didibus ◴[] No.44482191[source]
Ya, the models have already stolen everyone's copyrighted intellectual property, so I don't have a lot of sympathy. In fact, the more the merrier: if we're going to brush off that they're all trained on copyrighted material, we might as well make sure they end up a really cheap, competitive, low-margin, accessible commodity.
replies(1): >>44482313 #
4. oblio ◴[] No.44482206[source]
Didn't know Sam Altman was Chinese :-)
5. esskay ◴[] No.44482209[source]
It is very hard to have any sympathy; they stole already-stolen material from people who are known not to care that they are stealing.
6. tonyedgecombe ◴[] No.44482214[source]
Nobody who pays attention to Huawei will be surprised. They have a track record of this sort of behaviour going right back to their early days.
replies(1): >>44482448 #
7. some_random ◴[] No.44482275[source]
Claiming to care deeply about IP theft in the more nebulous case of model training datasets, then dismissing the extremely concrete case of outright theft, seems pretty indefensible to me.
replies(3): >>44482363 #>>44482381 #>>44482529 #
8. ◴[] No.44482276[source]
9. lambdasquirrel ◴[] No.44482313[source]
Eh... you should read the article. It sounds like a pretty big deal.
replies(1): >>44484179 #
10. perching_aix ◴[] No.44482363[source]
Par for the course for emotional thinking; I'm not even surprised anymore.
11. Arainach ◴[] No.44482381[source]
Everyone has a finite amount of empathy, and I'm not going to waste any of mine on IP thieves complaining that someone stole their stolen IP from them.
replies(1): >>44483260 #
12. ◴[] No.44482447[source]
13. npteljes ◴[] No.44482448{3}[source]
While true, reports like this are exactly the track record on which we base those assessments.
14. ◴[] No.44482505[source]
15. pton_xd ◴[] No.44482529[source]
> dismissing the extremely concrete case of outright theft seems pretty indefensible to me.

Outright theft is a meaningless term here. The new rules are different.

The AI space is built on "traditionally" bad-faith actions: misappropriation of IP by using pirated content and ignoring source code licenses, borderline malicious website scraping, recitation of data without attribution. Copying model code / artifacts / weights is just the next most convenient course of action. And really, who cares? The ethical operating standards of the industry have been established.

16. mensetmanusman ◴[] No.44483260{3}[source]
It’s theft in the same way that taking a picture of nature you had nothing to do with is theft.
replies(1): >>44483333 #
17. Arainach ◴[] No.44483333{4}[source]
This line of argument was worn out and tired when 14-year-olds on Napster were parroting it in 1999.
18. didibus ◴[] No.44484179{3}[source]
I did read the article. Apart from it sounding like a terrible place to work, I'm not sure I see what the big deal is.

No one knows how any of these models got made; their training data is kept secret, we don't know what it contains, and so on. I'm also pretty sure a few of the main labs poached each other's employees, who just reimplemented the same training models with some twists.

Most LLMs are also based on initial research papers where most of the discovery and innovation took place.

And in the end, it's all trained on data that very few people agreed or intended to be used for this purpose, and for which they won't see a dime.

So why not wrap and rewrap models and resell them, and let everyone compete on who offers the cheapest plan or per-token cost?