Google Titans architecture, helping AI have long-term memory

(research.google)

584 points Alifatisk | 1 comments | 07 Dec 25 12:23 UTC

okdood64 ◴[07 Dec 25 14:05 UTC] No.46181759[source]▶

From the blog:

Is there any other company that's openly publishing their research on AI at this level? Google should get a lot of credit for this.

replies(12): >>46181829 #>>46182057 #>>46182168 #>>46182358 #>>46182633 #>>46183087 #>>46183462 #>>46183546 #>>46183827 #>>46184875 #>>46186114 #>>46189989 #

Palmik ◴[07 Dec 25 20:33 UTC] No.46184875[source]▶

>>46181759 #

DeepSeek and other Chinese companies. Not only do they publish research, they also put their resources where their mouth (research) is. They actually use it and prove it through their open models.

Most research coming out of big US labs is counterindicative of practical performance. If it worked (too) well in practice, it wouldn't have been published.

Some examples from DeepSeek:

https://arxiv.org/abs/2405.04434

https://arxiv.org/abs/2502.11089

replies(1): >>46186643 #

abbycurtis33[dead post] ◴[07 Dec 25 23:51 UTC] No.46186643[source]▶

>>46184875 #

[flagged]

CGMthrowaway[dead post] ◴[07 Dec 25 23:58 UTC] No.46186712[source]▶

>>46186643 #

[flagged]

elmomle ◴[08 Dec 25 00:41 UTC] No.46187015[source]▶

>>46186712 #

Your comment seems to imply "these views aren't valid" without any evidence for that claim. Of course the theft claim was a strong one to make without evidence too. So, to that point--it's pretty widely accepted as fact that DeepSeek was at its core a distillation of ChatGPT. The question is whether that counts as theft. As to evidence, to my knowledge it's a combination of circumstantial factors which add up to paint a pretty damning picture:

(1) Large-scale exfiltration of data from ChatGPT when DeepSeek was being developed, and which Microsoft linked to DeepSeek

(2) DeepSeek's claim of training a cutting-edge LLM using a fraction of the compute that is typically needed, without providing a plausible, reproducible method

(3) Early DeepSeek coming up with near-identical answers to ChatGPT--e.g. https://www.reddit.com/r/ChatGPT/comments/1idqi7p/deepseek_a...

replies(4): >>46187080 #>>46187116 #>>46188534 #>>46189289 #

1. nl ◴[08 Dec 25 07:06 UTC] No.46189289[source]▶

>>46187015 #

> Large-scale exfiltration of data from ChatGPT when DeepSeek was being developed, and which Microsoft linked to DeepSeek

This is not the same thing at all. Current legal doctrine is that ChatGPT output is not copyrightable, so at most DeepSeek violated the terms of use of ChatGPT.

That isn't IP theft.

To add to that example, there are numerous open-source datasets that are derived from ChatGPT data. Famously, the Alpaca dataset kick-started the open-source LLM movement by fine-tuning Llama on a GPT-derived dataset: https://huggingface.co/datasets/tatsu-lab/alpaca
