that use this tech on Linux, BSD, or Windows. When will IBM make this tech work in the 970 class proc?
OpenMP works best on programs where you have a dataset that can be split into chunks and then the chunks assigned to worker threads. Much "multimedia" is great for OpenMP because you can easily segment the data.
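To make the "split into chunks, hand chunks to worker threads" pattern concrete, here is a minimal sketch. It is not OpenMP: it swaps in Python's standard `concurrent.futures` thread pool purely to show the structure, and `split_into_chunks`, `brighten` and `parallel_brighten` are made-up names for illustration. In CPython the global interpreter lock means pure-Python work like this will not actually run faster on several threads, so treat it as a picture of the decomposition rather than a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, n_chunks):
    """Split data into n_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def brighten(chunk):
    """Hypothetical per-sample work: scale 8-bit values and clamp at 255."""
    return [min(255, int(v * 1.2)) for v in chunk]


def parallel_brighten(samples, n_workers=4):
    """Process each chunk on a worker thread and stitch the results together."""
    chunks = split_into_chunks(samples, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() yields results in the same order as the input chunks.
        results = pool.map(brighten, chunks)
        return [v for part in results for v in part]
```

Each chunk is independent of the others, which is what makes this kind of "multimedia" data easy to segment: `parallel_brighten(samples)` gives the same answer as `brighten(samples)` run on one thread.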
What is problematic, though, is that language-level support for high-performance threading is immature compared to the compilers and some of the high-performance libraries (OpenMP, MPI, et al.).
We have many C++ programming frameworks that are not even thread-safe, much less amenable to thread-based optimization. Most current C++ GUI frameworks are classic examples of frameworks that were not designed with high-performance threading in mind.
As hardware threading support evolves, I would expect languages to start tracking these developments, and we will see new advanced parallelism constructs in our familiar languages.
To date, Erlang is the only language I know of that has implemented pervasive multi-processing.
Good effort, but it's not in that 50-100 percent speed increase range where you say "wow, real cool".
It's nice.
Think again...:
An Intel P4 2800 costs 439 E, a 3,06 costs 699 E - now do the math and figure how many percent that is for the little increase in speed. I have seen videos from THG where two systems running head to head with video applications are equally fast. One of them is a plain 3,6 GHz P4, the other a 3,06 with HT enabled. Now, what does this tell us? In the above case you pay more than 70% extra for only 7% more CPU power. With HT, you get 20% free and you don't care..? - So be it..
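Taking the quoted prices and clock speeds at face value, the arithmetic can be checked in a few lines. This is only a sketch: it uses clock speed as a stand-in for CPU power, which is a simplifying assumption, and `percent_more` is a made-up helper. On these inputs the premium works out closer to 59% more money for about 9% more clock, while matching the 3,6 GHz part would take roughly 18% more clock, close to the 20% credited to HT.

```python
def percent_more(new, old):
    """How many percent larger `new` is than `old`."""
    return (new / old - 1.0) * 100.0


# Prices in euros and clock speeds in GHz, as quoted in the comment above.
price_extra = percent_more(699, 439)    # 3,06 GHz part vs 2,8 GHz part
clock_extra = percent_more(3.06, 2.8)   # clock speed gained for that money
ht_equivalent = percent_more(3.6, 3.06) # clock a non-HT chip needed to keep up

print(f"price premium:        {price_extra:.1f}%")
print(f"clock increase:       {clock_extra:.1f}%")
print(f"HT-equivalent clock:  {ht_equivalent:.1f}%")
```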
I thought pseudo-code should be written so it was easily readable. 
Read on the last page as to who is who. There are 5 of them, there was no space in the db field to mention all of them by name.
> Who is that ?
Wonders too, who uses OSNews as intel PR
</rant>
To me that looks a bit too "scientific" for the average OSNews reader, but maybe I'm wrong
I'll definitely read it all sometime.
I really wonder what this thing would give with a multithreading-crazy BeOS (where we don't need optimizing compilers)...
Btw, I recently noticed VideoLan Client on BeOS was even more multithreaded than native media players
(is it too on other platforms ?)
>To me that looks a bit too "scientific" for the average OSNews reader
I don't think so. Supposedly most of our readers are actually programmers/engineers:
http://www.osnews.com/story.php?news_id=2037
I read the first 2 pages, and then decided that I will try to read it again some other time when I can take some more time to digest it.
Some more layman's explanation with the examples would have been nice, though.
On a side note: I would very much like to run BeOS or OpenBeOS on a dual-CPU hyperthreading system, for example a dual Xeon or so. This would allow 4 threads in parallel.
I already use BeOS on my dual PIII and it rocks.
On the other hand, it might be worth waiting for a 32/64-bit Xeon. I still think Intel will release 32-bit compatible CPUs once AMD starts selling them. They had better, because I will not fork out $4000 for a single-CPU Itanium 2.
This looks like a draft for a peer-reviewed journal paper, and hence targeted at a different audience than me and presumably a lot of others. Can't comment on the facts as I got lost on the 2nd para! I'm sadly not a computer scientist. I've got no problems with such stuff appearing on OSNEWS though - makes a break from looking at log files :-)
If you want to know what hyperthreading is all about, in understandable English, read the great article at Ars Technica: http://arstechnica.com/paedia/h/hyperthreading/hyperthreading-1.htm...
> >To me that looks a bit too "scientific" for the average OSNews reader
> I don't think so. Supposedly most of our readers are actually programmers/engineers:
Yes, though this looks more like a Ph.D. paper
(don't have anything against that btw)
Page 5 lists the references and authors. The article was written by five PhDs (to include other degrees). Intel has always been very good about providing in-depth documentation about their microprocessor architecture. You have to wonder about its usefulness to AMD sometimes.
>To me that looks a bit too "scientific" for the average OSNews reader, but maybe I'm wrong
Don't let the math formulas fool you. These kinds of scientific articles always have them, but no one really reads them, unless of course there are no source code examples and we really have to.
I am surprised that OpenMP helps. It would seem the best case would be two instruction streams that are not related. OpenMP is usually used to create threads doing the same operations. In this case it would seem that they would be competing for the same resources. Perhaps this makes up for the lack of registers in a P4. Does having the second state allow more data to flow to the same resources? Anyone know?
I did not see mention of the negatives. Is it just die space or do single streams get a performance/latency hit?
I would imagine that, with the poor state of SMP in most OSS kernels, pretending to have two processors could easily more than make up for that performance increase.
I have seen lots of tests where two processors slow the Linux kernel down instead of speeding it up.
But perhaps on an HPTC machine having a separate virtual processor to handle OS requests might not be too bad.
Anyone know what the big P4 Xeon Linux clusters do about hyperthreading?
Let's take a close look at their results, referring to figures 13 and 14. The hyperthreading is giving them at best a 13% speed boost over the non-hyperthreading scenario. This is evident in the single-processor case. The inherent parallelism of the operation is evident from the fact that they get nearly a factor of 2 speed improvement in the dual-processor case. The speedup in the hyperthreaded dual processors is simply an aggregate of the ten percent speed gains within each processor. What Figure 13 is therefore showing is that even in cases where parallelism in the algorithm is excellent, as evidenced by the boost in the DP score, we still only get marginal speed improvements with hyperthreading.
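One rough way to put numbers on this reading is Amdahl's law, which relates the speedup on n processors to the fraction of the work that can run in parallel. The sketch below is only illustrative: the 1.9x dual-processor figure is a placeholder assumption standing in for "nearly a factor of 2" (the exact value is not quoted here), the 13% is the best-case hyperthreading gain mentioned above, multiplying the two is just a simple model of the "aggregate" argument, and the function names are made up.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Amdahl's law: speedup when only part of the work runs in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


def implied_parallel_fraction(speedup, n_processors):
    """Invert Amdahl's law to estimate the parallel fraction from a speedup."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_processors)


ASSUMED_DP_SPEEDUP = 1.9  # placeholder for "nearly a factor of 2"
HT_GAIN = 1.13            # best-case hyperthreading boost quoted above

# Fraction of the workload that would have to be parallel to see 1.9x on 2 CPUs.
p = implied_parallel_fraction(ASSUMED_DP_SPEEDUP, 2)

# Simple model: each physical processor gets the same hyperthreading gain.
dp_plus_ht = ASSUMED_DP_SPEEDUP * HT_GAIN

print(f"implied parallel fraction: {p:.3f}")
print(f"modelled DP + HT speedup:  {dp_plus_ht:.2f}x")
```

Under these assumptions about 95% of the work is parallel, yet the second logical processor per chip adds far less than a second physical processor does, which is the point being made.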
Figure 14 also shows an inherent problem with trying to fool the system into thinking there are four processors instead of only two. As it states, the algorithm is really only working on three processes simultaneously. The system, believing it has four full-fledged processors, is therefore inefficiently distributing idle tasks among the two physical processors, in deference to the four simulated processors. This shows that there can be a functional decrease in speed in a hyperthreaded system. The single-processor hyperthreaded case for the algorithm used in Figure 14 did perform very well, but again, looking at the dual-processor case, it is evident that the algorithm lends itself to parallelism.
This article therefore highlights two things in my mind:
1. OpenMP is effective in parallelizing algorithms "on the fly" so to speak.
2. Hyperthreading does increase performance, but not substantially.
Are there articles on simultaneous thread execution on completely different computations, rather than functionally parallel threads? For example, what kind of speedup would there be if one thread was doing the SVM calculation and the other was doing the AVSR one? Better still, what would happen if we distributed two threads for SVM execution and three for AVSR? Interesting thoughts....
Does anyone know if AMD has any tools for OpenMP in the works?
Anyone else working on this? Will the Intel compiler (in 32-bit only :( ) work on Opteron?
Thanx




