This subject is fascinating and the article is informative, but I wish that HN had a button like "flag", but specifically for articles that seem to be written by AI (well, at least the section "How STARFlow compares with OpenAI’s 4o image generator" sounds like it).

replies(4): >>44393696 #>>44393957 #>>44394563 #>>44401032 #

5. kelseyfrog ◴[27 Jun 25 03:35 UTC] No.44393483[source]▶

>>44358535 (OP) #

Forgotten from like 2021? NVAE[1] was a great paper but maybe four years is long enough to be forgotten in the AI space? shrug

1. NVAE: A Deep Hierarchical Variational Autoencoder https://arxiv.org/pdf/2007.03898

replies(2): >>44393507 #>>44393824 #

6. bbminner ◴[27 Jun 25 03:40 UTC] No.44393507[source]▶

>>44393483 #

Right, it is bizarre to read that someone "unearthed a forgotten AI technique" that you happened to have worked with/on when it was still hot - when did I become a fossil? :D

Also, if we're being nitpicky, diffusion model inference has been proven equivalent to (and is often used as) a particular NF so.. shrug

7. sipjca ◴[27 Jun 25 03:41 UTC] No.44393509[source]▶

>>44393382 #

somewhat hard to say how the cards fall when the cost of 'intelligence' is coming down 1000x year over year while at the same time compute continues to scale. the bet should be made on both sides probably

replies(1): >>44393733 #

8. peepeepoopoo137 ◴[27 Jun 25 04:07 UTC] No.44393622[source]▶

>>44393382 #

"""The bitter lesson""" is how you get the current swath of massively unprofitable AI companies that are competing with each other over who can lose money faster.

replies(1): >>44393727 #

9. CharlesW ◴[27 Jun 25 04:25 UTC] No.44393696[source]▶

>>44393474 #

FWIW, you can always report any HN quality concerns to hn@ycombinator.com and it'll be reviewed promptly and fairly (IMO).

10. furyofantares ◴[27 Jun 25 04:33 UTC] No.44393727{3}[source]▶

>>44393622 #

I can't tell if you're perpetuating the myth that these companies are losing money on their paid offerings, or just overestimating how much money they lose on their free offerings.

replies(1): >>44394618 #

11. furyofantares ◴[27 Jun 25 04:34 UTC] No.44393733{3}[source]▶

>>44393509 #

10x year over year, not 1000x, right? The 1000x is from this 10x observation having held for 3 years.
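The arithmetic behind that correction, as a minimal sketch: a constant yearly cost reduction compounds multiplicatively, so 10x per year held for 3 years gives 1000x overall. The function names are illustrative only, and the 10x and 3-year figures are simply the ones quoted in the comment.

```python
def compounded_reduction(per_year_factor: float, years: int) -> float:
    """Total reduction after `years` of a constant yearly reduction factor."""
    return per_year_factor ** years


def implied_yearly_factor(total_factor: float, years: int) -> float:
    """Constant yearly factor that would produce `total_factor` over `years`."""
    return total_factor ** (1.0 / years)


total = compounded_reduction(10.0, 3)
print(total)  # prints 1000.0

# Going the other way: 1000x over 3 years implies roughly 10x per year.
yearly = implied_yearly_factor(1000.0, 3)
```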

replies(1): >>44416258 #

12. nabla9 ◴[27 Jun 25 05:01 UTC] No.44393824[source]▶

>>44393483 #

They are both variational inference, but Normalizing Flow (NF) is not VAE.

replies(1): >>44394090 #

13. bitpush ◴[27 Jun 25 05:34 UTC] No.44393954[source]▶

>>44358535 (OP) #

I find it fascinating that Apple-centric media sites are stretching so much to position the company in the AI race. The title is meant to say that Apple found something unique that other people missed, when the simplest explanation is they started working on this a while back (2021 paper, after all) and just released it.

A more accurate headline would be - Apple starting to create images using 4 year old techniques.

replies(8): >>44394030 #>>44394108 #>>44394182 #>>44394662 #>>44394807 #>>44395238 #>>44395774 #>>44401963 #

14. Veen ◴[27 Jun 25 05:35 UTC] No.44393957[source]▶

>>44393474 #

It reads like the work of a professional writer who uses a handful of variant sentence structures and conventions to quickly write an article. That’s what professional writers are trained to do.

15. danhau ◴[27 Jun 25 05:57 UTC] No.44394030[source]▶

>>44393954 #

This „4 year old technique“ apparently could give Apple an edge for on-device workloads.

> short: both Apple and OpenAI are moving beyond diffusion, but while OpenAI is building for its data centers, Apple is clearly building for our pockets.

replies(1): >>44394277 #

16. kelseyfrog ◴[27 Jun 25 06:08 UTC] No.44394090{3}[source]▶

>>44393824 #

If you read the paper, you'll find "More Expressive Approximate Posteriors with Normalizing Flows" is in the methods section. The authors are in fact using (inverse) normalizing flows within the context of VAEs.

The appendix goes on to explain, "We apply simple volume-preserving normalizing flows of the form z′ = z + b(z) to the samples generated by the encoder at each level".
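To illustrate why a map of the form z′ = z + b(z) can be volume-preserving, here is a minimal sketch. It assumes b is autoregressive, meaning output i depends only on coordinates before i; that assumption and the specific toy function `b` below are mine for illustration and are not taken from the quoted text. Under it, the Jacobian is triangular with ones on the diagonal, so its determinant is 1 and the log-determinant term in the change-of-variables formula vanishes.

```python
import math


def b(z):
    # Hypothetical autoregressive shift: output i uses only z[0..i-1].
    return [
        0.0,
        math.tanh(0.5 * z[0]),
        math.tanh(0.3 * z[0] - 0.7 * z[1]),
    ]


def flow(z):
    # z' = z + b(z)
    return [zi + bi for zi, bi in zip(z, b(z))]


def jacobian(f, z, eps=1e-6):
    # Central finite differences; entry [i][j] approximates d f_i / d z_j.
    n = len(z)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        z_plus, z_minus = list(z), list(z)
        z_plus[j] += eps
        z_minus[j] -= eps
        f_plus, f_minus = f(z_plus), f(z_minus)
        for i in range(n):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return jac


def det3(m):
    # Determinant of a 3x3 matrix by cofactor expansion along the first row.
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


J = jacobian(flow, [0.4, -1.2, 2.0])
log_det = math.log(abs(det3(J)))  # close to 0 for a volume-preserving map
```

Because the determinant is 1 regardless of the input, the density of z′ can be evaluated without any extra Jacobian cost, which is the practical appeal of this kind of flow.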

17. politelemon ◴[27 Jun 25 06:12 UTC] No.44394108[source]▶

>>44393954 #

> I find it fascinating that Apple-centric media sites are stretching so much to position the company in the AI race.

A glance through the comments also shows HNers doing their best too. The mind still boggles as to why this site is so willing to perform mental gymnastics for a corporate.

replies(1): >>44395053 #

18. rTX5CMRXIfFG ◴[27 Jun 25 06:26 UTC] No.44394182[source]▶

>>44393954 #

That site's target market is what we know as "Apple fanboys". I'm not one to consider 9to5 serious journalism (nor even worthy to post in HN), but even those publications that I consider serious are businesses, too, and need to pander to their markets in order to make money.

19. 7speter ◴[27 Jun 25 06:29 UTC] No.44394202{3}[source]▶

>>44393454 #

Maybe for a big LLM, but if they added some GPU cores and an order of magnitude or two more unified memory to their iDevices, or shoehorned M-series SoCs into high-tier iDevices (especially as their lithography process advances), image generation becomes more viable, no? Also, I thought I read somewhere that Apple wanted to infer simpler queries locally and switch to datacenter inference when the request was more complicated.

If they approach things this way, and transistor progress continues linearly (relative to the last few years) maybe they can make their first devices that can meet these goals in… 2-3 years?

20. bitpush ◴[27 Jun 25 06:45 UTC] No.44394277{3}[source]▶

>>44394030 #

The same edge Apple had summarizing notifications so poorly that they had to turn it off?

https://arstechnica.com/apple/2024/11/apple-intelligence-not...

replies(1): >>44394511 #

21. janalsncm ◴[27 Jun 25 07:27 UTC] No.44394511{4}[source]▶

>>44394277 #

That was a bad and unnecessary feature but the privacy benefits of running a model on device rather than in the cloud are undeniable.

replies(1): >>44396980 #

22. janalsncm ◴[27 Jun 25 07:34 UTC] No.44394563[source]▶

>>44393474 #

I had the opposite reaction, it definitely reads like a tech journalist who doesn’t have a great understanding of the tech. AI would’ve written a less clunky (and possibly incorrect) explanation.

23. janalsncm ◴[27 Jun 25 07:38 UTC] No.44394586[source]▶

>>44393382 #

The bitter-er lesson is that distillation from bigger models works pretty damn well. It’s great news for the GPU poor, not great for the guys training the models we distill from.
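For readers unfamiliar with the term, a minimal sketch of one common form of distillation: the student is trained to match the teacher's temperature-softened output distribution via a KL-divergence loss. This is the textbook soft-label formulation, assumed here for illustration; the comment does not say which distillation method is meant, and the function names and temperature value are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    # Numerically stable softmax over temperature-scaled logits.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL(teacher || student) on softened distributions, scaled by T^2
    # so gradient magnitudes stay comparable across temperatures.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


teacher = [2.0, 0.5, -1.0]
matched = distillation_loss(teacher, teacher)             # student agrees with teacher
mismatched = distillation_loss([-1.0, 0.5, 2.0], teacher)  # student disagrees
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge, which is why the student only needs the teacher's outputs rather than its training data.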

replies(1): >>44401947 #

24. janalsncm ◴[27 Jun 25 07:41 UTC] No.44394618{4}[source]▶

>>44393727 #

If it costs you a billion dollars to train a GPT5 and I can distill your model for a million dollars and get 90% of the performance, that’s a terrible deal for you. Or more realistically, whoever you borrowed from.

replies(1): >>44401954 #

25. darkstar_16 ◴[27 Jun 25 07:48 UTC] No.44394662[source]▶

>>44393954 #

I think it's just Apple PR pushing these out now to get Apple's name out in the AI era.

26. yorwba ◴[27 Jun 25 07:58 UTC] No.44394727[source]▶

>>44393382 #

They took a simple technique (normalizing flows), instantiated its basic building blocks with the most general neural network architecture known to work well (transformer blocks), and trained models of different sizes on various datasets to see whether it scales. Looks very bitter-lesson-pilled to me.

That they didn't scale beyond AFHQ (high-quality animal faces: cats, dogs and big cats) at 256×256 is probably not due to an explicit preference for small models at the expense of output resolution, but because this is basic research to test the viability of the approach. If this ever makes it into a product, it'll be a much bigger model trained on more data.

EDIT: I missed the second paper https://arxiv.org/abs/2506.06276 where they scale up to 1024×1024 with a 3.8-billion-parameter model. It seems to do about as well as diffusion models of similar size.

27. ◴[27 Jun 25 08:08 UTC] No.44394807[source]▶

>>44393954 #

28. amelius ◴[27 Jun 25 08:49 UTC] No.44395053{3}[source]▶

>>44394108 #

We seriously need an AI to dampen the reality distortion field and bring back common sense. Maybe it can be something that people install in their browsers.

29. coldtea ◴[27 Jun 25 09:24 UTC] No.44395238[source]▶

>>44393954 #

>I find it fascinating that Apple-centric media sites are stretching so much to position the company in the AI race

Or, you know, just posting an article based on an Apple press release about a new technique that falls squarely within their target audience's interests (people reading Apple-centric news) and is a great fit for the currently fashionable technologies (AI) people will show interest in.

Without giving a fuck to "position the company in the AI race". They'd post about Apple sewers having an issue at their HQs, if that news story was available.

Besides, when did Apple ever come first in some particular tech race (say, the mp3 player, the smartphone, the store, the tablet, the smartwatch, maybe VR now)? What they typically do is wait for the dust to settle and sweep the end-user end of that market.

replies(1): >>44396911 #

30. niyyou ◴[27 Jun 25 11:04 UTC] No.44395774[source]▶

>>44393954 #

It's not even some "forgotten AI technique" (sigh...). It's been a hot topic for the last 5 years. Used a lot with Variational Auto-encoders, etc. Such bad journalism.

31. bitpush ◴[27 Jun 25 14:06 UTC] No.44396911{3}[source]▶

>>44395238 #

Precisely. Remember how they waited for AR/VR space to settle and then swept the end user market?

Or the smash hit Homepods.

Or Siri :)

replies(1): >>44400816 #

32. bitpush ◴[27 Jun 25 14:17 UTC] No.44396980{5}[source]▶

>>44394511 #

The fact that they shipped it shows they didn't know what they were doing, private or not.

replies(1): >>44399686 #

33. janalsncm ◴[27 Jun 25 19:44 UTC] No.44399686{6}[source]▶

>>44396980 #

That’s a little unfair imo. Statistical models make mistakes and have failure modes which are difficult to predict.

When the bug popped up, turning the feature off was easier than retraining and redeploying.

34. npinsker ◴[27 Jun 25 22:23 UTC] No.44400816{4}[source]▶

>>44396911 #

The VR space doesn’t seem settled to me. And I think Apple could win — having tried most of the major models, Apple’s is noticeably better. It really does feel like magic (other than the weight).

35. lukan ◴[27 Jun 25 22:54 UTC] No.44401032[source]▶

>>44393474 #

If you enjoyed the article, why would you want to flag or tag it? For what purpose?

replies(1): >>44402635 #

36. rfv6723 ◴[28 Jun 25 02:26 UTC] No.44401947{3}[source]▶

>>44394586 #

Distillation is great for researchers and hobbyists.

But nearly all frontier models have anti-distillation ToS, so distillation is out of the question for western commercial companies like Apple.

replies(1): >>44402386 #

37. rfv6723 ◴[28 Jun 25 02:30 UTC] No.44401954{5}[source]▶

>>44394618 #

Then if you offer your distilled model for commercial services, you would get sued by OpenAI in court.

38. msgodel ◴[28 Jun 25 02:33 UTC] No.44401963[source]▶

>>44393954 #

Given the tiny amount of funding they have Apple's ML team does some really amazing stuff. I think they're actually underappreciated by the public.

39. tomhow ◴[28 Jun 25 03:32 UTC] No.44402134[source]▶

>>44358535 (OP) #

Comments moved to https://news.ycombinator.com/item?id=44400105.

40. janalsncm ◴[28 Jun 25 04:55 UTC] No.44402386{4}[source]▶

>>44401947 #

Even if Apple needs to train an LLM from scratch, they can distill it and deploy on edge devices. From that point, inference is free to them.

41. nextaccountic ◴[28 Jun 25 06:18 UTC] No.44402635{3}[source]▶

>>44401032 #

Well, maybe this article isn't AI-written after all. But the intent was to add an (AI) tag beside the title.

42. sipjca ◴[29 Jun 25 20:40 UTC] No.44416258{4}[source]▶

>>44393733 #

I believe the 1000x number I pulled is from SemiAnalysis or similar, using MMLU as the baseline benchmark and the cost per token from a year ago to today at the same score. Model improvements, hardware improvements and software improvements all make a massive difference and, when combined, produce much greater than 10x gains in terms of intelligence/$.
