Quad-core processors are only the beginning of what a revitalized Intel has to offer, the company’s top executives said here Sept. 26. The chip maker will deliver its first quad-core processors – chips that incorporate four processing cores each – for both desktops and servers in November, CEO Paul Otellini said in the opening keynote of the Intel Developer Forum. The quad-core chips will offer up to 70 percent greater performance in desktops and 50 percent in servers.
Intel and Apple: Pushing the Envelope.
Intel may have a great CPU, but its shared-bus architecture really falls behind AMD’s HyperTransport.
On servers this will hurt badly, because something like GbE NICs or a couple of fast SATA HDDs will choke the bus.
Sources? I’d like to read up on that.
Do a search for Intel’s next-generation bus, called CSI, which is a lot like HyperTransport. That’ll basically tell you why the old FSB architecture sucks.
I’m assuming the quad-core Core 2 CPUs are going to become Intel’s flagship high-performance processors, push the high-priced Core 2 Duo processors down to the mainstream market, and make the Pentium D the entry-level, low-budget CPU, until they release the Pentium E next year.
Edited 2006-09-27 00:06
Pentium E ?!?!?!
http://www.tgdaily.com/2006/09/20/conroe_single_core/
Be optimized for it? A lot of games don’t support SMP/dual core/SLI as it is.
>> Be optimized for it? A lot of games don’t support SMP/dual core/SLI as it is.
Of course BeOS will be optimized for it. BeOS and Haiku love multiple CPUs; the more the better.
Sorry guys, I just could not resist.
70% greater than what?
Than the marketing figures mentioned last year, probably.
8 cores, 16 cores, 32 cores… It’s nice to hear marketing blah from Intel. As long as only a few applications really support it, it’s utterly useless.
Not if you’re running BeOS
As shown at 07:50 in this demo:
http://video.google.co.uk/videoplay?docid=1659841654840942756
I’m far from an expert on SMP, but I don’t think I’ve seen anything in that old video that can’t be done with any SMP-ready kernel out there.
Honest question stemming from my ignorance: how come every time SMP is discussed, BeOS is brought up as a silver bullet? I seem to understand that its own GUI/window manager was heavily threaded, and that its API encouraged multithreaded development in frontends.
But regarding the CPU-intensive parts of a program not written in any OS-specific manner (let’s say the Gecko renderer, or the filters in GIMP): is there anything that BeOS does that is more SMP-friendly than other OSes? I don’t know, migrating processes from one core to the other to keep the load balanced, or more esoteric stuff?
Thanks to anyone that takes the time to satisfy my curiosity.
Hello Kitty
The Be API not only encourages multithreaded code, it automatically makes each window run in its own thread. This means that any program that opens a window is multithreaded. Of course this doesn’t help much if all the work of the application happens in the thread of its only window, but I believe apps on BeOS generally use these threads for user interaction only. This is in contrast to, say, Windows, where I’m sure you have experienced menus and other GUI components hanging when the app has a lot of work to do.
In addition, the application in the linked video has two threads that only do rendering. This is application-specific code and is possible on any modern OS, but I guess the BMessage-centric API makes it easier to achieve. Someone with actual experience beyond reading a book on BeOS programming might be able to elaborate.
bogomipz has the answer pretty well nailed. The multithreaded nature of the windows makes porting apps from “traditional” architectures a pain, but makes your UI feel responsive even when the app is pegging the CPU. The window threads also run at higher priority than most apps, which helps some too.
I just finished multithreading a Linux app and I really missed the BeOS threading API… 😀
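For the curious, the window-per-thread pattern described above can be approximated outside BeOS. This is a minimal sketch in Python (not the Be C++ API), with hypothetical names: the window thread only drains its message queue, while heavy work runs in a separate worker thread, so the “UI” never blocks on computation.

```python
# A portable sketch of the BeOS "one thread per window" idea. The class and
# message names are made up for illustration; they mimic BLooper/BMessage
# only in spirit.
import queue
import threading

class WindowThread:
    """Stand-in for a BWindow/BLooper: a thread draining a message queue."""
    def __init__(self):
        self.messages = queue.Queue()      # stand-in for the BMessage port
        self.handled = []
        self.thread = threading.Thread(target=self._loop)
        self.thread.start()

    def post(self, msg):                   # in spirit of BLooper::PostMessage()
        self.messages.put(msg)

    def _loop(self):                       # the window's message loop
        while True:
            msg = self.messages.get()
            if msg == "QUIT":
                return
            self.handled.append(msg)

def heavy_work(win):
    total = sum(range(10_000_000))         # pretend CPU-bound job
    win.post(f"work done, sum={total}")

win = WindowThread()
worker = threading.Thread(target=heavy_work, args=(win,))
worker.start()
win.post("mouse click")                    # UI events are handled while work runs
worker.join()
win.post("QUIT")
win.thread.join()
print(win.handled)
```

The point of the design is that the window loop never computes anything expensive itself; it only dispatches, which is why BeOS GUIs stayed responsive under load.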
I seem to understand that its own GUI/window manager was heavily threaded, and that its API encouraged multithreaded development in frontends.
But regarding the CPU intensive parts of a program not written in any OS-specific special manner (let’s say the Gecko renderer, or the filters in GIMP) is there anything that BeOS does that is more SMP-friendly than other OSes?
Not only was BeOS’s window manager heavily threaded; pretty much everything in BeOS is, from top to bottom. In the 1990s, that was not common. For example, all drivers and kernel modules, file systems included, must be SMP-aware (no giant lock), and many of them even use extra threads to achieve better/smoother asynchronous I/O handling.
The kernel is fully thread-safe and uses threads itself. The BONE network stack uses threading everywhere.
And, indeed, all userland servers and their corresponding libraries (the BeOS “kits”, aka the C++ API) are threaded. The BeOS kits not only encouraged multithreaded development; in the case of the GUI, it’s enforced on the developer.
Which, in the 1990s, unfortunately didn’t help this IMHO great operating system quickly reach the critical mass of graphical software ported from Unix or Windows code bases. At that time, before SMP became mainstream, multithreaded development and parallel programming were not that well mastered by developers. Things have improved since. Alas, Be Inc. was too early and too small to survive those years.
Besides this pervasive threading model, BeOS has no other “exotic” SMP features.
That’s inaccurate. Many applications are threaded today. Even my toy editor runs more than 15 threads. And even if applications aren’t, the kernel can schedule processes to run on different cores. The question is not whether applications are ready, but rather whether kernels can take advantage of multiple cores. Today, most modern kernels can.
I’m really tired of hearing this constant “but there are 500 threads running on my PC!” BS. Almost all of them are just waiting, eating no CPU time.
I don’t see your point. The same can be said of almost all applications (processes) running on your PC. How is that relevant to the issue at hand?
The beginning of what?
🙂 Well, the beginning of the next marketing campaign, the beginning of the rest of Intel’s days, whatever
For you, for us, we’ll believe it when we see it, so it’s a nothing-here-move-along effect.
“70% more” than whatever sounds nice. Still, saying these chips will deliver 70% more performance than, say, the Core 2s sounds a bit, well, worthy of some doubt.
You win something with X cores if you run at least X-1 heavy applications, so your system will still be responsive. But in today’s use, one core is sufficient for most users; dual core is useful for geeks (ripping a DVD while playing a game) and for anybody who needs a very responsive system. But four cores and more, without a specific workload, is a waste of CPU and nothing else. It will be useful for servers.
(This comment assumes that most applications are not efficiently threaded.)
Edited 2006-09-27 11:03
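The intuition above – extra cores only pay off when there is enough parallel work – can be quantified with Amdahl’s law, where the speedup on n cores is 1 / ((1 - p) + p/n) for a workload whose parallel fraction is p. The fractions below are illustrative, not measurements of any real application.

```python
# Amdahl's law: the serial fraction (1 - p) caps the speedup no matter how
# many cores you add. The parallel fractions here are made-up examples.
def amdahl_speedup(parallel_fraction, cores):
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

for p in (0.5, 0.9, 0.99):
    s4 = amdahl_speedup(p, 4)
    print(f"parallel fraction {p:.2f}: 4 cores give {s4:.2f}x speedup")
```

Even a 90%-parallel program gets barely 3x out of 4 cores, which matches the comment: without heavily parallel workloads, the extra cores mostly idle.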
I agree. I think dual core is a pretty great thing even for standard desktops, as it can make your system much more responsive when multitasking. However, with 4 cores (and fast cores at that) in today’s desktops, I think there are going to be a lot of people firing up their multithreaded HD video encoder and wondering why each core is only running at 60% load. They will be hitting other bottlenecks; the bus will become saturated with disk I/O traffic and memory accesses. I’m sure someone can crunch the numbers and get a rough estimate of the impact of this phenomenon.
The new CSI bus from Intel is supposed to be their answer to this problem, but it doesn’t appear to be anywhere near desktops yet. It’s slated to first be available for Xeons and Itaniums sometime in 2007.
However with 4 cores (and fast cores at that) in today’s desktops I think there are going to be a lot of people firing up their multithreaded HD video encoder and wondering why each core is only running at 60% load.
“Test results with the software packages Main Concept with H.264 encoding and the WMV-HD conversion make this very clear. We noticed performance jumps of up to 80% when compared to the Core 2 Duo at the same clock speed (2.66 GHz).”
http://www.tomshardware.com/2006/09/10/four_cores_on_the_rampage/pa…
They will be hitting other bottlenecks; the bus will become saturated with disk I/O traffic and memory accesses. I’m sure someone can crunch the numbers and get a rough estimate of the impact of this phenomenon.
“Our test results reveal that a FSB1333 (true 333 MHz) does not entail advantages – at least not based on the tests at a CPU clock speed of 2.66 GHz. At CPU clock speeds of 3.0 GHz and above, and memory speeds beyond the DDR2-1000 mark (true 500 MHz), the FSB1333 shows what it is capable of.
One should not forget – viewed from the perspective of the Pentium 4 – that the Core 2 micro-architecture offers a few features to ease the strain on memory access, whereby higher FSB or memory speeds barely register any speed advantages.”
http://www.tomshardware.com/2006/09/10/four_cores_on_the_rampage/pa…
The problem is you are assuming large data flows to each CPU; for some programs, the more CPUs you have, the less data needs to be sent to each CPU.
Example: the other day I wanted to resize a collection of pictures I have (using TAR on BeOS). A quick rough resize is done as fast as I move the mouse, but the smooth resize takes a few seconds per picture. There is no reason the picture could not be broken up into 80 or more overlapping pieces and each piece sent to a separate CPU for processing. When you look at the amount of data moved into the CPUs, there is not that much; most of the pictures were less than a meg in size.
Note: the following numbers are just examples, not real-world measurements.
If a modern single CPU processes a picture a second, we have to move 2 MBytes/sec on the bus. If we have 80 CPUs of the same power, we have to move 160 MBytes/sec to keep up. That is not even close to what a modern bus can do. For image processing, the code for any one function is relatively small; it will run out of the local cache.
I don’t mean to say the above are true real-world figures, but I do know that in graphics and photo work, doing a lot of simple functions in parallel on a chunk of data makes a lot of sense.
And there are a lot of graphic and photo people out there.
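The overlapping-pieces idea above can be sketched quite directly. This is a simplified illustration in Python, not the commenter’s actual app: the “picture” is a 1-D row of pixels, the filter is a hypothetical 3-tap mean (standing in for a real resampling filter), and each strip carries one pixel of overlap so strip borders come out identical to a sequential pass.

```python
# Chunked parallel smoothing: split the data into overlapping strips and let
# each worker process its strip independently, so per-worker data stays small.
from multiprocessing import Pool

def smooth_strip(args):
    """Average each pixel with its neighbours inside one padded strip."""
    strip, start, lo, hi = args        # start = global index of strip[0]
    out = []
    for i in range(lo, hi):            # emit only the strip's own pixels
        j = i - start
        window = strip[max(j - 1, 0):j + 2]
        out.append(sum(window) / len(window))
    return out

def smooth(picture, workers=4):
    """Parallel 3-tap mean filter over `picture` (a list of pixel values)."""
    n = len(picture)
    step = (n + workers - 1) // workers
    jobs = []
    for lo in range(0, n, step):
        hi = min(lo + step, n)
        pad_lo = max(lo - 1, 0)        # one pixel of overlap on each side,
        pad_hi = min(hi + 1, n)        # so borders match a sequential pass
        jobs.append((picture[pad_lo:pad_hi], pad_lo, lo, hi))
    with Pool(workers) as pool:
        strips = pool.map(smooth_strip, jobs)
    return [px for strip in strips for px in strip]

if __name__ == "__main__":
    print(smooth([1.0, 2.0, 3.0, 4.0], workers=2))  # → [1.5, 2.0, 3.0, 3.5]
```

The key property is the one the comment relies on: each worker receives only its strip plus a thin halo, so the data shipped per CPU shrinks as the number of CPUs grows.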
These systems are for pros and enthusiasts. I may not always be utilizing the full power of my system, but when I do, I’m glad I have it.
Here’s an example: let’s say that for one of my clients I have to process 5 large datasets, encrypt the output, and take a hash of the encrypted output. The script below simulates this exercise:
In bash (on one line):
t1=$(date +%s); r=0; while [ $r -le 4 ]; do openssl rand -base64 1234567890 | openssl enc -bf -k guess | openssl sha1 > /dev/null & r=$((r+1)); done; wait; t2=$(date +%s); echo $((t2-t1))
On my MacPro here are the results:
4 CPUs = 139 seconds
2 CPUs = 269 seconds
1 CPU = 550 seconds
While this work is occurring, it is nice to still be able to use the system for e-mail, writing documentation, web research, listening to some tunes, etc.
Bert
Edited 2006-09-27 13:33