“Using a GPU for computational workloads is not a new concept. The first work in this area dates back to academic research in 2003, but it took the advent of unified shaders in the DX10 generation for GPU computing to be a plausible future. Around that time, Nvidia and ATI began releasing proprietary compute APIs for their graphics processors, and a number of companies were working on tools to leverage GPUs and other alternative architectures. The landscape back then was incredibly fragmented and almost every option required a proprietary solution – either software, hardware or both. Some of the engineers at Apple looked at the situation and decided that GPU computing had potential – but they wanted a standard API that would let them write code and run on many different hardware platforms. It was clear that Microsoft would eventually create one for Windows (ultimately DirectCompute), but what about Linux, and OS X? Thus an internal project was born, that would eventually become OpenCL.”
I’m pleasantly surprised to read the part stating that multi-core CPUs are also targeted by OpenCL.
Sweet!
Yes, OpenCL is really nice. It’s really a front-end to many different architectures. You program in OpenCL and it can run on a GPU (Nvidia and ATI), a CPU (via AMD Stream), or other kinds of processors. No need for porting.
The resulting programs might not be as fast as Nvidia’s CUDA (I haven’t seen anything proving that, though), but at least they can run _everywhere_.
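Just to give an idea of what that looks like (a rough, untested sketch; the kernel name and variables are made up for illustration): the kernel source stays the same no matter what hardware it ends up on, and the host just asks for a device type.

/* vec_add.cl - the same kernel source can be built for a CPU or a GPU device */
__kernel void vec_add(__global const float *a,
                      __global const float *b,
                      __global float *c)
{
    size_t i = get_global_id(0);   /* one work-item per element */
    c[i] = a[i] + b[i];
}

/* Host side (error checking omitted): the only hardware-specific line is the
   device query. Swap CL_DEVICE_TYPE_GPU for CL_DEVICE_TYPE_CPU and the same
   kernel is built for and run on the CPU instead. */
cl_platform_id platform;
cl_device_id device;
clGetPlatformIDs(1, &platform, NULL);
clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);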
Except on OS’s that don’t have OpenCL available, which is everything other than the Big 3.
Yes, it’s still a new technology. The 1.1 specification is just 5 months old, and 1.0 is only a year old. It’s not mature like MPI, but then MPI has been around for something like two decades…
I’ve got OpenCL from Nvidia running on Debian, as they’ve packaged it in Experimental.
Doesn’t Debian belong to the “Big 3” as a Linux distro? I thought that was shorthand for Windows/OS X/Linux…
Yeah, that’s what I meant.
Interesting. A lot of developments in this area.
A few months ago Nvidia also announced that it is porting the CUDA platform to x86:
http://arstechnica.com/business/news/2010/09/nvidia-ports-its-cuda-…
I wonder how the integrated GPU/CPU chips coming out next year (Intel’s Sandy Bridge and AMD Fusion) are going to affect these developments.
Hmm, interesting. That’s why Nvidia’s drivers do not support OpenCL on the CPU. They still want to lock people in with CUDA.
What is worse, in my opinion, is that OpenGL and OpenCL describe the drivers and not the hardware. If they became real hardware standards, or provided standard graphics/acceleration hardware interfaces that are vendor- and OS-independent (meaning no vendor- or OS-specific drivers), then we could see a real revolution. You could have acceleration out of the box without installing drivers, and the OS could provide everything regardless of the chipset. For me at least, OpenCL is a tremendous opportunity to utilize a 6-core Phenom that is fully documented. Hardware should be designed according to standards, not standards according to drivers.
Couldn’t agree more… Though being a hobby OS developer might result in some bias in that area
Seriously, why should hardware vendors be trusted to provide entire parts of the operating system in the form of (bloated) drivers, when they could just follow a standard spec for the hardware/software interface, and (re)write the spec when it isn’t good enough for them?
From what I understand, when it comes to Sandy Bridge from Intel, OpenCL will be built on the AVX extensions, which should provide the sort of performance one would normally get from a dedicated GPU. There are rumours that Apple may consider AMD, but I think those rumours are premature and misplaced, because even though AMD has made great strides on battery life, Intel still has the crown in that area.
While these integrated chips are on a single package, they’re still two discrete components placed inside a single box. From the OpenCL perspective, whether the devices are integrated or sit on discretely separate PCIe bus lanes, the programming API is unchanged. The OS and the framework still see them as two logically separate devices. When retrieving the list of available OpenCL devices, you’d get a CPU device and a GPU device. The physical integration into a single package is invisible to the API.
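For instance, enumerating the devices looks roughly like this (an untested sketch, error checking omitted, and the array size is arbitrary):

#include <CL/cl.h>
#include <stdio.h>

int main(void)
{
    cl_platform_id platform;
    cl_device_id devices[8];
    cl_uint n = 0;

    clGetPlatformIDs(1, &platform, NULL);
    /* Ask for every device the platform exposes: on a Fusion or Sandy Bridge
       style system you would still see one CPU device and one GPU device. */
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 8, devices, &n);

    for (cl_uint i = 0; i < n; i++) {
        char name[128];
        cl_device_type type;
        clGetDeviceInfo(devices[i], CL_DEVICE_NAME, sizeof(name), name, NULL);
        clGetDeviceInfo(devices[i], CL_DEVICE_TYPE, sizeof(type), &type, NULL);
        printf("%s (%s)\n", name, (type & CL_DEVICE_TYPE_GPU) ? "GPU" : "CPU/other");
    }
    return 0;
}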
Yeah.
Though that doesn’t mean you shouldn’t create different versions for each device (or device type), be it by auto-generating code or by hand-crafting it.
E.g. each device has a preferred vector size, and using it will make things faster. Further, each system has a different amount of local memory, etc.
What I find great is that it’s relatively easy to write a kernel, and with a little time it’s quite fast. Only writing the boilerplate code sucks, though the bindings can help there.
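Querying those per-device properties is just a couple of clGetDeviceInfo calls, e.g. (a fragment of a sketch, assuming the usual CL/cl.h and stdio.h includes and that `dev` is a cl_device_id you already obtained):

cl_uint vec_width;
cl_ulong local_mem;

/* Preferred float vector width and local memory size differ per device,
   so a kernel tuned around these values will differ too. */
clGetDeviceInfo(dev, CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT,
                sizeof(vec_width), &vec_width, NULL);
clGetDeviceInfo(dev, CL_DEVICE_LOCAL_MEM_SIZE,
                sizeof(local_mem), &local_mem, NULL);

printf("preferred float vector width: %u\n", vec_width);
printf("local memory: %lu bytes\n", (unsigned long)local_mem);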
Exactly. OpenCL is really not that portable when it comes to optimized kernels; a lot of templating is necessary.
I.e. well-performing code on ATI GPUs will not necessarily perform efficiently on NVIDIA parts, and vice versa.
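One common way to handle that is to build the same kernel source with device-specific compile options and #defines. A rough sketch (the TILE values are made up, not tuned numbers; it assumes `program` and `device` already exist and `is_nvidia` was derived from a CL_DEVICE_VENDOR query):

/* The kernel source uses TILE for its __local work-group tiling,
   so one source file can be specialized per vendor at build time. */
const char *opts = is_nvidia ? "-DTILE=16"   /* illustrative NVIDIA value */
                             : "-DTILE=8";   /* illustrative ATI value */
clBuildProgram(program, 1, &device, opts, NULL, NULL);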
The most “portable” of these technologies, ironically, is CUDA. Granted, it is only portable across NVIDIA architectures, and now x86 CPUs.
The biggest issue with these sorts of tools is that, as long as the GPUs and the like live in different address spaces, the programming models will continue to be significantly hindered.
For all the hoo-ha about Apple’s closed system and standards, they do a lot of good work in open source and open standards. OpenCL is great. I really hope it takes flight.