BrookGPU
   HOME

TheInfoList



OR:

The Brook programming language and its implementation BrookGPU were early and influential attempts to enable
general-purpose computing on graphics processing units General-purpose computing on graphics processing units (GPGPU, or less often GPGP) is the use of a graphics processing unit (GPU), which typically handles computation only for computer graphics, to perform computation in applications traditiona ...
. Brook, developed at
Stanford University Stanford University, officially Leland Stanford Junior University, is a private research university in Stanford, California. The campus occupies , among the largest in the United States, and enrolls over 17,000 students. Stanford is consider ...
graphics group, was a compiler and runtime implementation of a
stream programming In computer science, stream processing (also known as event stream processing, data stream processing, or distributed stream processing) is a programming paradigm which views data streams, or sequences of events in time, as the central input and ou ...
language targeting modern, highly parallel
GPU A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mobi ...
s such as those found on
ATI Ati or ATI may refer to: * Ati people, a Negrito ethnic group in the Philippines **Ati language (Philippines), the language spoken by this people group ** Ati-Atihan festival, an annual celebration held in the Philippines *Ati language (China), a ...
or
Nvidia Nvidia CorporationOfficially written as NVIDIA and stylized in its logo as VIDIA with the lowercase "n" the same height as the uppercase "VIDIA"; formerly stylized as VIDIA with a large italicized lowercase "n" on products from the mid 1990s to ...
graphics cards. BrookGPU compiled programs written using the Brook stream programming language, which is a variant of
ANSI C ANSI C, ISO C, and Standard C are successive standards for the C programming language published by the American National Standards Institute (ANSI) and ISO/IEC JTC 1/SC 22/WG 14 of the International Organization for Standardization (ISO) and the ...
. It could target
OpenGL OpenGL (Open Graphics Library) is a cross-language, cross-platform application programming interface (API) for rendering 2D and 3D vector graphics. The API is typically used to interact with a graphics processing unit (GPU), to achieve hardwa ...
v1.3+,
DirectX Microsoft DirectX is a collection of application programming interfaces (APIs) for handling tasks related to multimedia, especially game programming and video, on Microsoft platforms. Originally, the names of these APIs all began with "Direct", ...
v9+ or AMD's
Close to Metal In computing, Close To Metal ("CTM" in short, originally called ''Close-to-the-Metal'') is the name of a beta version of a low-level programming interface developed by ATI, now the AMD Graphics Product Group, aimed at enabling GPGPU computing. CT ...
for the computational backend and ran on both
Microsoft Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for serv ...
and
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, which ...
. For debugging, BrookGPU could also
simulate A simulation is the imitation of the operation of a real-world process or system over time. Simulations require the use of Conceptual model, models; the model represents the key characteristics or behaviors of the selected system or proc ...
a virtual graphics card on the CPU.


Status

The last major beta release (v0.4) was in October 2004 but renewed development began and stopped again in November 2007 with a v0.5 beta 1 release. The new features of v0.5 include a much upgraded and faster
OpenGL OpenGL (Open Graphics Library) is a cross-language, cross-platform application programming interface (API) for rendering 2D and 3D vector graphics. The API is typically used to interact with a graphics processing unit (GPU), to achieve hardwa ...
backend which uses framebuffer objects instead of PBuffers and harmonised the code around standard OpenGL interfaces instead of using proprietary vendor extensions.
GLSL OpenGL Shading Language (GLSL) is a high-level shading language with a syntax based on the C programming language. It was created by the OpenGL ARB (OpenGL Architecture Review Board) to give developers more direct control of the graphics pipeli ...
support was added which brings all the functionality (complex branching and loops) previously only supported by DirectX 9 to OpenGL. In particular, this means that Brook is now just as capable on
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, which ...
as
Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for serv ...
. Other improvements in the v0.5 series include multi-backend usage whereby different threads can run different Brook programs concurrently (thus maximising use of a multi-GPU setup) and SSE and
OpenMP OpenMP (Open Multi-Processing) is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating syste ...
support for the CPU backend (this allows near maximal usage of modern CPUs).


Performance comparison

A like for like comparison between desktop CPUs and GPGPUs is problematic because of algorithmic & structural differences. For example, a 2.66 GHz
Intel Core 2 Duo Intel Core is a line of streamlined midrange consumer, workstation and enthusiast computer central processing units (CPUs) marketed by Intel Corporation. These processors displaced the existing mid- to high-end Pentium processors at the time o ...
can perform a maximum of 25
GFLOPs In computing, floating point operations per second (FLOPS, flops or flop/s) is a measure of computer performance, useful in fields of scientific computations that require floating-point calculations. For such cases, it is a more accurate meas ...
(25 billion single-precision floating-point operations per second) if optimally using SSE and streaming memory access so the prefetcher works perfectly. However, traditionally (due to shader program length limits) most GPGPU kernels tend to perform relatively small amounts of work on large amounts of data in parallel, so the big problem with directly executing GPGPU algorithms on desktop CPUs is vastly lower memory bandwidth as generally speaking the CPU spends most of its time waiting on
RAM Ram, ram, or RAM may refer to: Animals * A male sheep * Ram cichlid, a freshwater tropical fish People * Ram (given name) * Ram (surname) * Ram (director) (Ramsubramaniam), an Indian Tamil film director * RAM (musician) (born 1974), Dutch * ...
. As an example, dual-channel PC2-6400 DDR2 RAM can throughput about 11 Gbit/s which is around 1.5 GFLOPs maximum given that there is a total of 3 GFLOPs total bandwidth and one must both read and write. As a result, if memory bandwidth constrained, Brook's CPU backend won't exceed 2 GFLOPs. In practice, it's even lower than that most especially for anything other than float4 which is the only data type which can be SSE accelerated. On an ATI HD 2900 XT (740 MHz core 1000 MHz memory), Brook can perform a maximum of 410 GFLOPs via its DirectX 9 backend. OpenGL is currently (due to driver and Cg compiler limitations) much less efficient as a GPGPU backend on that GPU, so Brook can only manage 210 GFLOPs when using OpenGL on that GPU. On paper, this looks like around twenty times faster than the CPU, but as just explained it isn't as easy as that. GPUs currently have major branch and read/write access penalties so expect a reasonable maximum of one third of the peak maximum in real world code - this still leaves that ATI card at around 125 GFLOPs some five times faster than the Intel Core 2 Duo. However this discounts the important part of transferring the data to be processed to and from the GPU. With a
PCI Express PCI Express (Peripheral Component Interconnect Express), officially abbreviated as PCIe or PCI-e, is a high-speed serial computer expansion bus standard, designed to replace the older PCI, PCI-X and AGP bus standards. It is the common ...
1.0 x8 interface, the memory of an ATI HD 2900 XT can be written to at about 730 Mbit/s and read from at about 311 Mbit/s which is significantly slower than normal PC memory. For large datasets, this can greatly diminish the speed increase of using a GPU over a well-tuned CPU implementation. Of course, as GPUs become faster far more quickly than CPUs and the PCI Express interface improves, it will make more sense to offload large processing to GPUs.


Applications and games that use BrookGPU

*
Folding@home Folding@home (FAH or F@h) is a volunteer computing project aimed to help scientists develop new therapeutics for a variety of diseases by the means of simulating protein dynamics. This includes the process of protein folding and the movements ...


See also

*
GPGPU General-purpose computing on graphics processing units (GPGPU, or less often GPGP) is the use of a graphics processing unit (GPU), which typically handles computation only for computer graphics, to perform computation in applications traditiona ...
*
CUDA CUDA (or Compute Unified Device Architecture) is a parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for general purpose processing, an approach ca ...
*
Close to Metal In computing, Close To Metal ("CTM" in short, originally called ''Close-to-the-Metal'') is the name of a beta version of a low-level programming interface developed by ATI, now the AMD Graphics Product Group, aimed at enabling GPGPU computing. CT ...
*
OpenCL OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-progra ...
*
Lib Sh Sh was an early metaprogramming language for programmable GPUs. It offered a general-purpose programming language, following a stream-processing model. Programs written in Sh could either run on CPUs or GPUs, obviating the need to write programs i ...
*
Intel Ct Intel Ct is a Parallel programming model, programming model developed by Intel to ease the exploitation of its future Multi-core processor, multicore chips, as demonstrated by the Intel Tera-Scale, Tera-Scale research program. It is based on the ex ...


References

{{reflist


External links


Official BrookGPU website
- Stanford University's BrookGPU website GPGPU GPGPU libraries