By 2010 supercomputers could be carrying out more than 1,000 trillion calculations per second. The ambitious goal has been set by the US Government to help its scientists tackle problems that would otherwise take too long to simulate.
Just get an Apple supercomputer.
Or you could build a Beowulf cluster using BeOS and a dumpster full of Commodore 64s.
“Mr Scott said tasks such as simulating the effects of car crashes on the human body, discovering new drugs, modelling protein folding and predicting climate change will all require petaflop computing.”
It has already been demonstrated that at least some of these tasks, like protein folding, drug discovery, bioinformatics and certainly cryptanalysis, are far better off running on FPGA boards, at 10-1000 times lower cost than just cramming expensive big-iron cpus into a big box.
As for the other problems, it's a matter of time before the computer scientists in those fields discover that C code can be turned into HW, but it sure helps to know how to use par & seq. At the heart of most of these problems is a big multi-level nested loop with some nice math inside. If you can find that, you have a nice HW solution waiting: the inner loop can take 1 clock cycle at, say, 50MHz or more instead of x lines of C. If you can't find it, then sometimes unoptimising, i.e. replacing quicksorts by merge or even bubble sorts, actually pays off better in HW.
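To make the sorting point concrete, here is a rough sketch in Python (not HW, and not anybody's production code). It uses odd-even transposition sort, which is a parallel cousin of bubble sort that I picked for illustration; the function name is made up. The thing to notice is that the schedule of compare-exchange steps is fixed and never depends on the data, and the pairs inside one pass don't overlap, so in HW each pass could plausibly be a row of comparators firing in the same clock.

```python
def odd_even_transposition_sort(values):
    """Sort using a fixed, data-independent schedule of compare-exchanges.

    Hypothetical illustration: each pass touches disjoint neighbour pairs,
    so a hardware version could run all comparisons of a pass in parallel.
    """
    data = list(values)
    n = len(data)
    for step in range(n):                  # n passes are enough for n items
        start = step % 2                   # alternate even pairs and odd pairs
        for i in range(start, n - 1, 2):   # pairs in one pass are independent
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data


if __name__ == "__main__":
    print(odd_even_transposition_sort([5, 2, 9, 1, 5, 6]))
```

On a single cpu this does O(n^2) work and loses badly to quicksort. With n/2 comparators side by side it takes n passes, which is the kind of trade described above.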
So far the CS professors have done a wonderful job of teaching folks how to write complex optimised algorithms that are great on single threaded cpus but lousy at being turned into HW or run on closely coupled processor arrays.
In case anyone doesn’t know, there are at least a handful of semiconductor companies that have developed chips with 256 or more cpus, but nobody seems to know what to do with them. Also, the newest, largest FPGAs can host large numbers of soft cpus, which combined with the FPGA logic fabric can do far more than GHz-class P4s. The only remaining weakness of FPGAs is floating point: they don’t do it.
end of rant
FPGA is a cool complement to traditional supercomputing.
There seem to be at least two types of supercomputer tasks – the tasks that can be divided into discrete isolated subtasks and distributed (seti@home, folding, drug discovery as you mentioned); and the tasks that can’t (nuclear simulation, car crashes, weather forecasts – anything “simulation” really).
Does FPGA scale to simulating the world’s weather at any reasonable resolution? How do the interconnects work?
I can imagine placing the logic cells geographically, but then you’d need a spherical FPGA..?
I can’t say that I know a lot about weather simulation or crashing cars or nuclear reactions or galactic collisions, or how those simulators actually compute those problems in detail.
But I can say that mother nature knows how to do this very well & has been doing so for eons. It uses infinite parallelism on an atomic scale and simple basic laws of physics, as if anybody didn’t know.
I can postulate from my own knowledge of basic physics that weather prediction, nuclear reactions and galactic collisions are all modelled essentially the same way: as lots of interacting 3d neighbouring processes running on a global world clock that ticks in minutes, femtoseconds or millennia.
These processes divvy up the world/universe into the smallest possible elements for the most accurate results. Each process can then be run time shared on a single cpu, or on a zillion cpus, or on an N cpu chip. It can also be turned into a dedicated HW engine with connections to all nearest neighbours. Each element in the grid usually doesn’t see past a few neighbours in most of these problems. How much computing each engine must perform determines how many you could pack into each FPGA. It may not be worth doing so unless the number of engines per FPGA overcomes the inherently slower clock speed of FPGAs, currently a 10x penalty. Also, floating point is not available, but there are workarounds for that long known to the DSP folks.
If the nature of a supercomp problem isn’t that of zillions of grid elements, or the elements don’t behave in a way where similar work is done on each node at each pass, it is best left to traditional cpus.
If the earth modelling were done in FPGA, the spherical nature of the grid would be the least of the engineering problems to be solved.
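A toy sketch of that grid idea, in Python and heavily simplified. Everything here is an assumption for illustration: the 2d grid instead of 3d, the wrap-around edges, the "average of 4 neighbours" update rule, the 8 fractional bits and all the names. It is not how any real weather or physics code works. It only shows the shape of the thing: every cell does the same small job on each tick, looks only at its nearest neighbours, and uses fixed-point integers (the usual DSP workaround) instead of floating point.

```python
FRAC_BITS = 8            # assumed fixed-point format: 8 fractional bits
ONE = 1 << FRAC_BITS     # the integer that represents 1.0


def to_fixed(x):
    """Convert a float to the assumed fixed-point integer format."""
    return int(round(x * ONE))


def to_float(q):
    """Convert a fixed-point integer back to a float for display."""
    return q / ONE


def step(grid):
    """One tick of the global clock over the whole grid.

    Each cell becomes the average of its 4 nearest neighbours (edges wrap).
    The new grid is built from the old one only, so every cell could be
    updated at the same time by its own small engine.
    """
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = (grid[(r - 1) % rows][c] + grid[(r + 1) % rows][c]
                     + grid[r][(c - 1) % cols] + grid[r][(c + 1) % cols])
            new[r][c] = total >> 2   # divide by 4 with a shift, no FP needed
    return new


if __name__ == "__main__":
    grid = [[0] * 4 for _ in range(4)]
    grid[1][1] = to_fixed(4.0)       # one hot cell
    for row in step(grid):
        print([to_float(q) for q in row])
```

The two nested loops over r and c are what a single cpu has to grind through. In an FPGA the body of the inner loop would be the engine, replicated as many times as fits.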
To see what the Bioinformatics people are doing, you could look at
http://www.timelogic.com
or Google bio fpga
That was pretty much my understanding too, although I have only read some articles about weather forecasting computation and nuclear simulation. The problem seems to be the way that cells interact with neighbours, and the interconnects; hence I wondered if a physically spherical FPGA would help..?
We do mesoscale atmospheric modeling using the Regional Atmospheric Modeling System (RAMS) here. RAMS is primarily a Fortran program which uses many complex interacting subsystems. Modeling is done on a 3 dimensional set of grids over a specified period of time. Due to the complexity of the interactions between the various subsystems, I’d say that atmospheric modeling isn’t a problem well suited to FPGAs.
Currently the world’s most powerful supercomputer is used for atmospheric modeling.
Well, I certainly wouldn’t want to argue with someone who is actually working on a weather supercomp problem, but I wouldn’t dismiss FPGAs so quickly unless it’s been tried & failed. Now, weather prediction on supercomps has one extra good reason to stay where it is: the US gov and a few others can easily afford such infinity cpus, and the results are generally shared with the world, right? Also, the US & Japan have a great deal of national prestige to gain by pursuing the world’s fastest supercomps no matter what the use, so they aren’t going away.
Many Bioinformatics folks would say the same thing about their field, & that world is full of PhDs who don’t generally know what the possibilities are on the HW side but are quite comfortable with Fortran/C. The big difference here is that FPGA systems have already proved themselves on problems that are not particularly ASIC-like. The Bio market definitely doesn’t share results, and most drug discovery companies will have to buy one or the other solution at a rate far faster than Moore’s law.
In the past I have been given the task of turning Matlab & Fortran codes into ASICs. Usually they are written in such a way as to be impossible or impractical to map easily to HW. Reworking the code with the original author to use equivalent, more structured engines (which may be far less or more efficient) allows the HW breakthrough to occur. Once the engineers can understand the basics, the rest is just grunt work. This usually follows the 80/20 or 90/10 rule: take only the small part of the code that takes all the time and leave the rest alone.
Also, engineers are a funny bunch: tell them it can’t be done and you will likely stir somebody to tackle an impossible problem.
never say never
this is v. cool. Usually interesting discussions take place between interested but uninformed people who have read a few journals (like me).
But here we have two informed people. Maybe a discussion can happen that can be turned into an osnews item?