AMD on Thursday laid out plans to serve 30 percent of the market within the next two years, with new quad-core processor designs scheduled for 2007 and an acceleration of its manufacturing capabilities. The company also talked about plans to build future processors with the ability to mix and match the building blocks of a chip to cater to different needs, and to allow its partners to add coprocessors that link directly to Opteron processors through AMD's HyperTransport links.
AMD is selling every Opteron it bakes quicker than they can cool down (and one of the reasons is that they run so cool; OK, enough of the puns and analogies).
I'm glad; I've always liked AMD's products and company philosophy. To be honest, I have the same posture toward Sun, so obviously I think the Galaxy servers are the cat's ass (the opposite of a rat's ass; as in, really, really good stuff). I suppose Sun sold a large share of those Opterons, too.
I liked everything Hester and Ruiz said in this article (and in the links to other recent CNET articles), but I think the interviewer hit a nerve when he mentioned the Cell processor. The response was that it has its place, but that it's not as good a design as their "external HT processor bus" because it's harder to program and doesn't do general-purpose computing as well.
As for the Cell being harder to program, that's true. But it will also be harder to program AMD's chips to efficiently offload certain tasks to the coprocessor. Any processing solution that involves specialized hardware for specialized workloads will either require skilled programmers or a large investment in compiler technology. In the case of the Cell, a sophisticated API was added to GCC to expose the various programming models to the developer. I'm sure AMD's new system will work similarly.
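For a feel of what that compiler support looks like from the C side, here's a minimal sketch using GCC's generic vector extensions (real, long-standing GCC syntax; the Cell toolchain layers its SPE intrinsics on similar machinery). The type name and the example values are mine, just for illustration:

```c
/* compile with: gcc -O2 vec.c */
#include <stdio.h>

typedef float v4sf __attribute__((vector_size(16)));  /* four packed floats */

union vec { v4sf v; float f[4]; };  /* union lets us read elements portably */

int main(void) {
    union vec a = { .f = {1, 2, 3, 4} };
    union vec b = { .f = {10, 20, 30, 40} };
    union vec c;

    c.v = a.v + b.v;  /* one element-wise add; GCC emits the target's SIMD op */

    for (int i = 0; i < 4; i++)
        printf("%g ", c.f[i]);  /* prints: 11 22 33 44 */
    printf("\n");
    return 0;
}
```

The developer writes one line of arithmetic and the compiler worries about the specialized hardware underneath, which is exactly the kind of investment in compiler technology I mean.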
A quad-core Opteron will outperform a stripped-down, in-order PPC970 in general-purpose computing, no doubt. But for applications where the customer needs really fast Java code execution, general-purpose logic doesn't really matter much. As the PS3 will prove, once you take those critical repetitive tasks and offload them to dedicated coprocessors, the remaining control logic is simple enough for a PPC970 to handle. The more coprocessors you have, and the more flexibly each can be programmed, the less the system relies on general-purpose arithmetic logic.
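To make that split concrete, here's a toy C sketch of the pattern; everything in it is made up for illustration, and the kernel function is just a local stand-in for code that would really be cross-compiled for an SPE or some other coprocessor:

```c
#include <stdio.h>

/* "Coprocessor" kernel: the hot, repetitive work (here, scaling a block).
   On Cell this body would run on an SPE; here it's an ordinary function. */
static void kernel_scale(float *block, int n, float factor) {
    for (int i = 0; i < n; i++)
        block[i] *= factor;
}

int main(void) {
    float frames[2][4] = {{1, 2, 3, 4}, {5, 6, 7, 8}};

    /* The control logic left on the host is trivial: pick a block,
       hand it off, collect the result. A modest in-order core suffices. */
    for (int f = 0; f < 2; f++) {
        kernel_scale(frames[f], 4, 0.5f);  /* stand-in for the offload call */
        printf("frame %d: %g %g %g %g\n", f,
               frames[f][0], frames[f][1], frames[f][2], frames[f][3]);
    }
    return 0;
}
```

The host side stays this trivial no matter how heavy kernel_scale gets, which is the whole point.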
So I think the Cell is simply a more radical take on the same philosophy AMD is trumpeting; AMD is just offering a smoother migration path from general-purpose computing to specialized, distributed computing.
As for the Cell being harder to program, that's true. But it will also be harder to program AMD's chips to efficiently offload certain tasks to the coprocessor. Any processing solution that involves specialized hardware for specialized workloads will either require skilled programmers or a large investment in compiler technology. In the case of the Cell, a sophisticated API was added to GCC to expose the various programming models to the developer. I'm sure AMD's new system will work similarly.
Considering recent acquisitions, I'm wondering if "coprocessor" means "GPU"; connecting the video directly to a HyperTransport link would have its advantages, and it wouldn't be any harder to program than the "non-standardized mess" that ATI and Nvidia have provided so far…
After that, how about a physics processor? 🙂
So I think the Cell is simply a more radical take on the same philosophy AMD is trumpeting,
On the contrary, the Cell is still nice in theory, but as a pragmatist I believe more in what is running well *now*, and that is, without any personal doubt, the whole AMD lineup.
A quad-core Opteron will outperform a stripped-down, in-order PPC970 in general-purpose computing, no doubt.
Cell doesn't have anything resembling a PPC970 in it. If it did, it'd be a whole different animal. The cores in the Cell are an extremely simple 2-issue in-order PPE and an even simpler 2-issue in-order SPE. Think performance characteristics closer to a PPC601 than a PPC970 and you'll have the right idea.
But for applications where the customer needs really fast Java code execution, general-purpose logic doesn’t really matter much.
Huh? What exactly do you think Java code profiles like? Java code is very branchy and very memory-operation intensive, things that Cell is not optimized for. This profile is characteristic of most high-level OOP languages, and is especially characteristic of business applications (the main use of Java) written in such languages.
Cell is pretty much a worst-case scenario for a language like Java. First, it's got long pipelines (18 stages, versus 12 for the Opteron), resulting in a large branch misprediction penalty. It's also got little (PPE) to no (SPE) branch prediction resources, making it very difficult to hide that misprediction penalty. It's in-order, which makes code scheduling hard, and it's got long primitive latencies, which makes code scheduling even harder. The minimum memory load on Cell is 5 cycles (L1 cache hit on the PPE) or 6 cycles (local-store load on the SPE). That is compared to 3 cycles for the Opteron's L1 cache. The L2 cache latency is a whopping 31 cycles on the PPE (and god only knows how high on the SPE), compared to 12-15 cycles on the Opteron. Not to mention that it has the infernal 2-cycle integer latency that the PPC970 had.
These latencies especially hurt high-level OOP code. OOP code is full of dynamic data dependencies (because so much data is accessed indirectly via pointers or pointers to pointers), and does a lot of random (as opposed to streaming) loads and stores. That makes it very hard to reorder instructions at compile time in a way that hides these high latencies. JIT'ed code is hurt even worse, because not only does it have to generate good code for the in-order core, it has to do it within a few hundred milliseconds, which precludes using powerful (and time-consuming) optimization algorithms that might statically resolve some of those data dependencies.
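To see why, picture the pointer chase that OOP code does constantly. A minimal C stand-in (the struct is made up, but the dependence pattern is the real thing):

```c
#include <stdio.h>
#include <stdlib.h>

struct node { struct node *next; int payload; };

int sum_list(struct node *n) {
    int sum = 0;
    while (n) {            /* branchy: each mispredict costs the full 18-stage pipe */
        sum += n->payload; /* this load depends on the pointer loaded last time */
        n = n->next;       /* serialized chain: nothing for the scheduler to overlap,
                              so every hop eats the full load-use latency
                              (3 cycles Opteron L1 vs. 5 on the PPE / 6 on the SPE) */
    }
    return sum;
}

int main(void) {
    struct node nodes[4];
    for (int i = 0; i < 4; i++) {
        nodes[i].payload = i + 1;
        nodes[i].next = (i < 3) ? &nodes[i + 1] : NULL;
    }
    printf("%d\n", sum_list(&nodes[0]));  /* prints 10 */
    return 0;
}
```

No compiler, JIT or otherwise, can reorder its way out of that chain; the latency of every single hop lands on the critical path.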
This is not at all to badmouth Cell. Cell is an excellent design for what it is meant to do, and its weaknesses are natural engineering trade-offs given its target market. However, ultimately, Cell isn't designed for the type of market AMD is pursuing. It's more the generalization of application-specific coprocessors than the specialization of general-purpose processors. What Cell is really great for replacing are things like soundcards and physics processing units (and maybe even GPUs, further down the line).
*drools at the idea of Java and XML processors hooked up to a dual quad-core setup running at 3.2 GHz+ speeds*
*drools at the idea of Java and XML processors hooked up to a dual quad-core setup running at 3.2 GHz+ speeds*
Wow, you must be an IT manager
Not quite… just a computer nut, a student (a job-seeking student, actually), an avid gamer, and a Java programmer who also happens to use a lot of XML!
Considering recent acquisitions, I'm wondering if "coprocessor" means "GPU"; connecting the video directly to a HyperTransport link would have its advantages, and it wouldn't be any harder to program than the "non-standardized mess" that ATI and Nvidia have provided so far…
Nope, think things in the server world: encryption offloading, TCP/IP offloading, and it might even be possible to connect a GPU straight to HyperTransport as well.
The general idea: move the stuff off the processor that can be done more efficiently by a piece of hardware that has been designed from the ground up to do a specific job.
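Right, and the usual software shape of that is a probe-and-fall-back dispatch. A hypothetical C sketch (the names and the toy byte-sum are mine; a real driver would probe actual hardware):

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

typedef uint32_t (*checksum_fn)(const uint8_t *buf, size_t len);

/* Toy software fallback: a plain byte sum standing in for a real
   checksum or crypto routine. */
static uint32_t checksum_sw(const uint8_t *buf, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* In a real driver this would probe for, say, an HT-attached offload
   engine and return its entry point; this sketch never finds one. */
static checksum_fn probe_offload_engine(void) {
    return NULL;
}

int main(void) {
    checksum_fn checksum = probe_offload_engine();
    if (!checksum)
        checksum = checksum_sw;  /* same interface either way */

    uint8_t pkt[] = {1, 2, 3, 4};
    printf("checksum = %u\n", checksum(pkt, sizeof pkt));
    return 0;
}
```

The caller never knows or cares whether the bytes were crunched by the host CPU or by a dedicated engine hanging off HyperTransport; the interface stays the same.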
You're certainly right, but I wouldn't be so quick to dismiss AMD's interest in the "next-next-gen" console market. AMD is clearly ramping up its ability to economically and quickly customize its processors for specialized workloads. While phase I of the plan, the external coherent HT bus, is attractive to IT customers and system integrators, phase II implies that some of these specialized processors will have enough volume to justify their own lithographic masks.
AMD seems to be targeting 2008 for this capability, which means it will be just in time for the Big Three console makers to base their next-next-gen consoles on customized AMD processors. While the ATI acquisition rumor is total nonsense, I'm sure AMD would gladly *partner* with ATI or Nvidia if it meant millions of console processors. Plus, I can't imagine the Big Three will let IBM walk away with the entire console market again; they'll surely try to diversify in order to get a sweeter deal.