Athlon 64: How AMD turned the tables on Intel

(dfarq.homeip.net)

bigstrat2003 ◴[25 Sep 25 19:20 UTC] No.45377613[source]▶

>>45376605 (OP) #

I remember at the time thinking it was really silly for Intel to release a 64-bit processor that broke compatibility, and was very glad AMD kept it. Years later I learned about kernel writing, and I now get why Intel tried to break with the old - the compatibility hacks piled up on x86 are truly awful. But ultimately, customers don't care about that, they just want their stuff to run.

replies(5): >>45377925 #>>45379301 #>>45380247 #>>45385323 #>>45386390 #

wvenable ◴[25 Sep 25 21:23 UTC] No.45379301[source]▶

>>45377613 #

Intel might have been successful with the transition if they hadn't decided to go with such a radically different architecture, untested in the real world, for Itanium.

replies(2): >>45379461 #>>45380469 #

pixl97 ◴[25 Sep 25 21:37 UTC] No.45379461[source]▶

>>45379301 #

Well, that, and Itanium was eye-wateringly expensive, while a standard PC was much cheaper at similar or faster speeds.

replies(1): >>45380251 #

1. Tsiklon ◴[25 Sep 25 22:48 UTC] No.45380251[source]▶

>>45379461 #

I think Itanium was a remarkable success in some other ways. Intel utterly destroyed the workstation market with it. HP-UX, IRIX, AIX, Solaris.

Itanium sounded the death knell for all of them.

The only Unix to survive with any market share is macOS (arguably because of its lateness to the party), and it has only relatively recently gone back to a more bespoke architecture.

replies(5): >>45380339 #>>45380406 #>>45382516 #>>45383193 #>>45388301 #

2. icedchai ◴[25 Sep 25 22:55 UTC] No.45380339[source]▶

>>45380251 (TP) #

I'd argue it was Linux (on x86) and the dot-com crash that destroyed the workstation market, not Itanium. The early 2000s was awash in used workstation gear, especially Sun. I've never seen anyone with an Itanium box.

replies(3): >>45380551 #>>45381130 #>>45387724 #

3. seabrookmx ◴[25 Sep 25 23:02 UTC] No.45380406[source]▶

>>45380251 (TP) #

HP-UX was one of the most popular operating systems to run on Itanium though?

replies(3): >>45380436 #>>45385510 #>>45386239 #

4. icedchai ◴[25 Sep 25 23:05 UTC] No.45380436[source]▶

>>45380406 #

HP was also one of the few companies to actually sell Itanium systems! They were also the last to stop selling them. They ported both OpenVMS and HP-UX to Itanium.

replies(2): >>45380569 #>>45382543 #

5. tyingq ◴[25 Sep 25 23:18 UTC] No.45380551[source]▶

>>45380339 #

I think the idea there is that it's less direct. Intel's lack of interest in a 64-bit x86 spawned AMD x64. The failure of Itanium then let that Linux/AMD x64 combination kill off the workstation market, and the larger RISC/Unix market. Linux on 32-bit x86 or 64-bit RISC alone was making some headway there, but the Linux/x64 combo is what enabled the full kill-off.

replies(1): >>45385498 #

6. tyingq ◴[25 Sep 25 23:19 UTC] No.45380569{3}[source]▶

>>45380436 #

Well, largely because they made it difficult for customers to stay on PA-RISC, then later, because their competitors were dying off...and if you were in the market for stodgy RISC/Unix there weren't many other choices.

replies(1): >>45386277 #

7. phire ◴[26 Sep 25 00:31 UTC] No.45381130[source]▶

>>45380339 #

While Linux helped, I'd argue the true factor is that x86 failed to die as projected.

The common attitude in the 80s and 90s was that legacy ISAs like 68k and x86 had no future. They had zero chance to keep up with the innovation of modern RISC designs. But not only did x86 keep up, it was actually outperforming many RISC ISAs.

The true factor is out-of-order execution. Some contemporary RISC designs were out-of-order too (especially Alpha, and PowerPC to a lesser extent), but both AMD and Intel were forced to go all-in on the concept in a desperate attempt to keep the legacy x86 ISA going.

It turns out large out-of-order designs were the correct path (mostly because OoO has the side effect of being able to reorder memory accesses and execute them in parallel), and AMD/Intel had a bit of a head start, a pre-existing customer base and plenty of revenue for R&D.

IMO, Itanium failed not because it was a bad design, but because it was on the wrong path. Itanium was an attempt to achieve roughly the same end goal as OoO, but with a completely in-order design, relying on static scheduling. It had massive amounts of complexity that let it re-order memory reads. In an alternative universe where OoO (aka dynamic scheduling) failed, Itanium might actually be a good design.

Anyway, by the early 2000s, there just wasn't much advantage to a RISC workstation (or RISC servers). x86 could keep up, was continuing to get faster and often cheaper. And there were massive advantages to having the same ISA across your servers, workstations and desktops.
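
To make the memory-reordering point concrete, here is a minimal toy model in Python. It is only a sketch under loud assumptions: the `Instr` record, the `simulate` function, the 100-cycle miss latency and the one-instruction-per-cycle issue width are all made up for illustration and do not describe any real core. It compares a stall-on-use in-order core with a core that may issue any ready instruction from a window.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    dest: str      # register written (assumed unique per instruction)
    srcs: tuple    # registers read
    latency: int   # cycles until the result is available


def simulate(program, width, out_of_order, window=32):
    """Return the cycle at which the last result becomes available."""
    # Registers produced inside the program are unavailable until their
    # producer issues; anything else is an input that is ready at cycle 0.
    ready_at = {ins.dest: math.inf for ins in program}
    pending = list(program)
    cycle = 0
    finish = 0
    while pending:
        # An in-order core only considers the oldest instructions;
        # an out-of-order core may pick from a larger window.
        candidates = pending[:window] if out_of_order else pending[:width]
        issued = []
        for ins in candidates:
            if len(issued) == width:
                break
            if all(ready_at.get(s, 0) <= cycle for s in ins.srcs):
                issued.append(ins)
            elif not out_of_order:
                break  # stall: younger instructions must wait
        for ins in issued:
            ready_at[ins.dest] = cycle + ins.latency
            finish = max(finish, cycle + ins.latency)
            pending.remove(ins)
        cycle += 1
    return finish


MISS = 100  # hypothetical cache-miss latency in cycles

program = [
    Instr("a", ("p",), MISS),  # load a = [p], misses the cache
    Instr("x", ("a",), 1),     # x = a + 1, depends on the first load
    Instr("b", ("q",), MISS),  # load b = [q], an independent miss
    Instr("y", ("b",), 1),     # y = b + 1, depends on the second load
]

print("in-order:    ", simulate(program, width=1, out_of_order=False))
print("out-of-order:", simulate(program, width=1, out_of_order=True))
```

In this toy model the in-order run finishes at cycle 202 and the out-of-order run at cycle 102, because the second miss is issued while the first is still outstanding. A static scheduler could get the same overlap by hoisting the second load above `x`, which is roughly the bet Itanium made.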

replies(2): >>45381317 #>>45382983 #

8. chasil ◴[26 Sep 25 01:01 UTC] No.45381317{3}[source]▶

>>45381130 #

Bob Colwell mentions originally doing out of order design at Multiflow.

He was a key player in the Pentium Pro out of order implementation.

https://www.sigmicro.org/media/oralhistories/colwell.pdf

"We should also say that the 360/91 from IBM in the 1960s was also out of order, it was the first one and it was not academic, that was a real machine. Incidentally that is one of the reasons that we picked certain terms that we used for the insides of the P6, like the reservation station that came straight out of the 360/91."

Here is his Itanium commentary:

"Anyway this chip architect guy is standing up in front of this group promising the moon and stars. And I finally put my hand up and said I just could not see how you're proposing to get to those kind of performance levels. And he said well we've got a simulation, and I thought Ah, ok. That shut me up for a little bit, but then something occurred to me and I interrupted him again. I said, wait I am sorry to derail this meeting. But how would you use a simulator if you don't have a compiler? He said, well that's true we don't have a compiler yet, so I hand assembled my simulations. I asked "How did you do thousands of line of code that way?" He said “No, I did 30 lines of code”. Flabbergasted, I said, "You're predicting the entire future of this architecture on 30 lines of hand generated code?" [chuckle], I said it just like that, I did not mean to be insulting but I was just thunderstruck. Andy Grove piped up and said "we are not here right now to reconsider the future of this effort, so let’s move on"."

replies(1): >>45382544 #

9. cryptonector ◴[26 Sep 25 04:04 UTC] No.45382516[source]▶

>>45380251 (TP) #

Absolutely not. Sun destroyed itself and Solaris, not Intel. The others were even more also-rans than Solaris.

replies(1): >>45387199 #

10. sillywalk ◴[26 Sep 25 04:10 UTC] No.45382543{3}[source]▶

>>45380436 #

HP also ported NonStop to Itanium.

11. phire ◴[26 Sep 25 04:10 UTC] No.45382544{4}[source]▶

>>45381317 #

> Bob Colwell mentions originally doing out of order design at Multiflow.

Actually no, it was Metaflow [0] who was doing out-of-order. To quote Colwell:

"I think he lacked faith that the three of us could pull this off. So he contacted a group called Metaflow. Not to be confused with Multiflow, no connection."

"Metaflow was a San Diego group startup. They were trying to design an out of order microarchitecture for chips. Fred thought what the heck, we can just license theirs and remove lot of risk from our project. But we looked at them, we talked to their guys, we used their simulator for a while, but eventually we became convinced that there were some fundamental design decisions that Metaflow had made that we thought would ultimately limit what we could do with Intel silicon."

Multiflow [1], where Colwell worked, has nothing to do with OoO; its design is actually way closer to Itanium. So close, in fact, that the Itanium project is arguably a direct descendant of Multiflow (HP licensed the technology and hired Multiflow's founder, Josh Fisher). Colwell claims that Itanium's compiler is nothing more than the Multiflow compiler with large chunks rewritten for better performance.

[0] https://en.wikipedia.org/wiki/Metaflow_Technologies

[1] https://en.wikipedia.org/wiki/Multiflow

replies(1): >>45382718 #

12. chasil ◴[26 Sep 25 04:41 UTC] No.45382718{5}[source]▶

>>45382544 #

I thoroughly acknowledge and enjoy your clarification.

13. stevefan1999 ◴[26 Sep 25 05:25 UTC] No.45382983{3}[source]▶

>>45381130 #

> The true factor is out-of-order execution.

I'm pressing X: the doubt button.

I would argue that speculative execution/branch prediction and wider pipelines, both of which OoO largely benefited from, mattered more than OoO itself, so OoO cannot be the sole factor. In fact, I believe improvements in semiconductor manufacturing process nodes could have contributed more to the IPC gain than OoO itself.

replies(1): >>45383517 #

14. inkyoto ◴[26 Sep 25 06:00 UTC] No.45383193[source]▶

>>45380251 (TP) #

Looking back, I think we can now conclude that it was largely inevitable for the other designs to fade sooner or later – and that is what has happened.

The late 90's to the early aughts' race for highest-frequency, highest-performance CPUs exposed not a need for a CPU-only, highly specialised foundry, but a need for sustained access to the very front of process technology – continuous, multibillion-dollar investment and a steep learning curve. Pure-play foundries such as TSMC could justify that spend by aggregating huge, diverse demand across CPUs, GPUs and SoCs, whilst only a handful of integrated device manufacturers could fund it internally at scale.

The major RISC houses – DEC, MIPS, Sun, HP and IBM – had excellent designs, yet as they pushed performance they repeatedly ran into process-cadence and capital-intensity limits. Some owned fabs but struggled to keep them competitive; others outsourced and were constrained by partners’ roadmaps. One can trace the pattern in the moves of the era: DEC selling its fab, Sun relying on partners such as TI and later TSMC, HP shifting PA-RISC to external processes, and IBM standing out as an exception for a time before ultimately stepping away from leading-edge manufacturing as well.

A compounding factor was corporate portfolio focus. Conglomerates such as Motorola, TI and NEC ran diversified businesses and prioritised the segments where their fab economics worked best – often defence, embedded processors and DSPs – rather than pouring ever greater sums into low-volume, general-purpose RISC CPUs. IBM continued to innovate and POWER endured, but industry consolidation steadily reduced the number of independent RISC CPU houses.

In the end, x86 benefited from an integrated device manufacturer (i.e. Intel) with massive volume and a durable process lead, which set the cadence for the rest of the field. The outcome was less about the superiority of a CPU-only foundry and more about scale – continuous access to the leading node, paid for by either gigantic internal volume or a foundry model that spread the cost across many advanced products.

replies(1): >>45384184 #

15. phire ◴[26 Sep 25 06:50 UTC] No.45383517{4}[source]▶

>>45382983 #

To be clear, when I (and most people) say OoO, I don't mean just the act of executing instructions out-of-order. I mean the whole modern paradigm of "complex branch predictors, controlling wide front-ends, feeding schedulers with wide back-ends and hundreds or even thousands of instructions in flight".

It's a little annoying that OoO is overloaded in this way. I have seen some people suggesting we should be calling these designs "Massively-Out-of-Order" or "Great-Big-Out-of-Order" in order to be more specific, but that terminology isn't in common use.

And yes, there are some designs out there which are technically out-of-order, but don't count as MOoO/GBOoO. The early PowerPC cores come to mind.

It's not that executing instructions out-of-order benefits from complex branch prediction and wide execution units; rather, OoO is what made it viable to start using wide execution units and complex branch prediction in the first place.

A simple in-order core simply can't extract that much parallelism; the benefits drop off quickly after two-wide superscalar. And accurate branch prediction is of limited usefulness when the pipeline is that short.

There are really only two ways to extract more parallelism. You either do complex out-of-order scheduling (aka dynamic scheduling), or you take the VLIW approach and try to solve it with static scheduling, like the Itanium. They really are just two sides of the same "I want a wide core" coin.

And we all know how badly the Itanium failed.

replies(1): >>45383734 #

16. stevefan1999 ◴[26 Sep 25 07:20 UTC] No.45383734{5}[source]▶

>>45383517 #

> I mean the whole modern paradigm of "complex branch predictors, controlling wide front-ends, feeding schedulers with wide back-ends and hundreds or even thousands of instructions in flight".

Ah, you mean the philosophy of having the CPU execute out of order.

> A simple in-order core simply can't extract that much parallelism

While true, it is also worth noting that it has no data hazards, because a pipeline simply doesn't exist at all, and thus there is no need for implicit pipeline bubbles or delay slots.

> And accurate branch prediction is of limited usefulness when the pipeline is that short.

You can also use a software virtual machine to turn an out-of-order CPU into something that basically runs in-order code, and you can see how slow that goes. That's why JIT VMs such as HotSpot and GraalVM for the JVM platform, RyuJIT for CoreCLR, and TurboFan for V8 are so much faster: when you compile to native instructions, the branch predictor can finally kick in.

> like the Itanium

> And we all know how badly the Itanium failed.

Itanium is not exactly VLIW. It is an EPIC [1] fail though.

[1]: https://en.wikipedia.org/wiki/Explicitly_parallel_instructio...

17. jabl ◴[26 Sep 25 08:32 UTC] No.45384184[source]▶

>>45383193 #

Yes. AFAIU the cost of process R&D and of building and running leading-edge fabs massively outweighs the cost of CPU architecture R&D. It's just a world of its own, largely outside the comfort zone of software people, hence we endlessly debate the merits of this or that ISA, or this or that microarchitecture, a bit like the drunkard searching for his keys under the streetlamp.

It's also interesting to note that back then the consensus was that you needed your own in-house fab with tight integration between the fab and CPU design teams to build the highest-performance CPUs. Merchant fabs were seen as second-best options for those who didn't need the highest performance or couldn't afford their own in-house fab. Only later did the meteoric rise of TSMC to the top spot on the semiconductor food chain upend that notion.

18. p_l ◴[26 Sep 25 12:03 UTC] No.45385498{3}[source]▶

>>45380551 #

Intel's lack of interest in delivering 64-bit for "peons" running x86 was also part of it. I remember that when the first discussions of amd64 showed up in popular computer magazines, Intel's proposed timeline was discussed too, and it very much indicated a wish to push "buy our super expensive stuff" and to squeeze out money.

Meanwhile, the decision to keep Itanium in an expensive but lower-volume market meant that there simply wasn't much market growth, especially once the non-technical part of killing other RISCs failed. Ultimately Itanium was left as the recommended way in some markets to run Oracle databases (due to a partnership between Oracle and HP) and not much else, while shops that used other RISC platforms either migrated to AMD64 or moved to other RISC platforms (even forcing HP to resurrect Alpha for one last generation).

19. p_l ◴[26 Sep 25 12:04 UTC] No.45385510[source]▶

>>45380406 #

Oracle sales would push you towards HP-UX on Itanium as the recommended platform.

To the point that once that ended with Oracle's purchase of Sun, there was a lawsuit between Oracle and HP. And there were a lot of angry customers, as HP-UX had been pushed right up to the moment of the acquisition announcement.

20. bluedino ◴[26 Sep 25 13:23 UTC] No.45386239[source]▶

>>45380406 #

That's what we ran. The core system was written in PICK Basic in the 80's and it just kept going on and on. I was buying HP Integrity (Itanium line) spare parts on eBay up until about 10 years ago.

21. icedchai ◴[26 Sep 25 13:27 UTC] No.45386277{4}[source]▶

>>45380569 #

As for RISC/Unix, in the enterprise, IBM's POWER/AIX is still around. I know some die hard IBM shops still using it.

I guess Oracle / Sun SPARC is also still hanging on. I haven't seen a Sun shop since the early 2000's...

replies(1): >>45388756 #

22. icedchai ◴[26 Sep 25 14:52 UTC] No.45387199[source]▶

>>45382516 #

If Sun had been more liberal with Solaris licensing on x86 in the early years (before, say, 2000), we might all be running Solaris servers today. Sun / Solaris was the Unix for most of the 90's through the dot-com crash.

Almost all early startups I worked with were Sun / Solaris shops. All the early ISPs I worked with had Sun boxes for their customer shell accounts and web hosts. They put the "dot in dot-com", after all...

23. kjs3 ◴[26 Sep 25 15:37 UTC] No.45387724[source]▶

>>45380339 #

Yup. I had a front row seat. So many discussions with startups in the 2000s boiled down to "we can get a Sun/HP/DEC machine, or we can get 4-5 nice Wintel boxes running Linux for the same price". So at the point where everyone figured out Linux was a 'good enough' Unix for dev work and porting to the incumbents was a reasonable prospect, it was "so do we all want to share one machine or go find 500% more funding just to have the marquee brand". Once you made that leap, "we don't need the incumbents" became inevitable.

replies(1): >>45388673 #

24. cameldrv ◴[26 Sep 25 16:28 UTC] No.45388301[source]▶

>>45380251 (TP) #

If you're counting all desktop/server computers, Linux has way more market share than all of the Unices ever did. It's probably even true for desktop Linux. If you count mobile phones, Android is a Linux derivative, and iOS is a BSD derivative. The fundamental issue for the workstation vendors was simply that with the P6, Intel was near parity or even ahead of the workstation vendors in performance, and it cost something like 1/4 as much.

25. icedchai ◴[26 Sep 25 17:00 UTC] No.45388673{3}[source]▶

>>45387724 #

It was amazing how fast that happened. I remember one startup that mainly supported Sun in the late 90's and early 2000's. This was for a so-called "enterprise" app that would run on-prem. They wanted me to move the app to Linux (Red Hat, I think?) so they could take it to a trade show booth without reliable Internet access. It was a pretty simple port.

26. kjs3 ◴[26 Sep 25 17:09 UTC] No.45388756{5}[source]▶

>>45386277 #

There's still a lot of AIX around and the LoB is seeing revenue growth. You just don't hear about it on HN because it's mostly doing mundane, mission critical stuff buried in large orgs.

I still run into a number of Solaris/SPARC shops, but even the most die hard of them are actively looking for the off-ramp. The writing is on that wall.

replies(1): >>45388855 #

27. icedchai ◴[26 Sep 25 17:21 UTC] No.45388855{6}[source]▶

>>45388756 #

I believe it! For a few years, I worked on a fairly large system deployed to an AIX environment. The hardware and software were both rock solid. While I haven't used it, the performance of the newer POWER stuff looks incredible.
