
Tesla is the codename for a GPU
microarchitecture
developed by
Nvidia
, and released in 2006, as the successor to
Curie
microarchitecture. It was named after the pioneering electrical engineer
Nikola Tesla
. As Nvidia's first microarchitecture to implement unified shaders, it was used with
GeForce 8 series,
GeForce 9 series,
GeForce 100 series,
GeForce 200 series, and
GeForce 300 series of GPUs, collectively manufactured in
90 nm,
80 nm,
65 nm,
55 nm, and
40 nm. It was also in the
GeForce 405 and in the
Quadro FX, Quadro x000, Quadro NVS series, and
Nvidia Tesla
computing modules.
Tesla replaced the old
fixed-pipeline microarchitectures, represented at the time of introduction by the
GeForce 7 series. It competed directly with AMD's first unified shader microarchitecture named
TeraScale, a development of ATI's work on the
Xbox 360
which used a similar design. Tesla was followed by
Fermi.
Overview
Tesla is Nvidia's first microarchitecture implementing the
unified shader model. The driver supports
Direct3D 10 Shader Model 4.0 /
OpenGL
2.1 (later drivers add OpenGL 3.3 support). The design is a major shift for NVIDIA in GPU functionality and capability, the most obvious change being the move from the separate functional units (pixel shaders, vertex shaders) of previous GPUs to a homogeneous collection of universal
floating point
processors (called "stream processors") that can perform a more universal set of tasks.
GeForce 8's unified shader architecture consists of a number of
stream processors (SPs). Unlike the
vector processing
approach taken with older shader units, each SP is
scalar and thus can operate only on one component at a time. This makes them less complex to build while still being quite flexible and universal. Scalar shader units also have the advantage of being more efficient in a number of cases as compared to previous generation
vector
shader units that rely on ideal instruction mixture and ordering to reach peak throughput. The lower maximum throughput of these scalar processors is compensated for by efficiency and by running them at a high clock speed (made possible by their simplicity). GeForce 8 runs the various parts of its core at differing clock speeds (clock domains), similar to the operation of the previous
GeForce 7 series GPUs. For example, the stream processors of GeForce 8800 GTX operate at a 1.35 GHz clock rate while the rest of the chip is operating at 575 MHz.
[Wasson, Scott. NVIDIA's GeForce 8800 graphics processor, Tech Report, 8 November 2007.]
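The efficiency argument for scalar units can be illustrated with a toy utilization model. This is a rough sketch and not a description of the real hardware scheduler: the function names, the 4-wide vector unit, and the sample instruction mix are all assumptions made for illustration, and stalls and latency are ignored.

```python
def vector_utilization(op_widths, lanes=4):
    """Fraction of lanes doing useful work when every operation occupies
    one full issue slot of a `lanes`-wide vector unit (toy model).

    Each entry in `op_widths` is the number of components (1..lanes)
    that the operation actually needs.
    """
    if not op_widths:
        return 0.0
    if any(w < 1 or w > lanes for w in op_widths):
        raise ValueError("each width must be between 1 and `lanes`")
    useful = sum(op_widths)            # component-operations that do real work
    issued = lanes * len(op_widths)    # lane-slots consumed
    return useful / issued


def scalar_utilization(op_widths):
    """Scalar units issue one component per slot, so in this toy model
    every issued slot is useful regardless of the instruction mix."""
    return 1.0 if op_widths else 0.0


# Hypothetical instruction mix: components touched by each operation.
mix = [4, 3, 1, 2, 3, 1]
print(f"vector unit utilization: {vector_utilization(mix):.2f}")
print(f"scalar unit utilization: {scalar_utilization(mix):.2f}")
```

In this model the vector unit only reaches full utilization when every operation uses all four components, which is the "ideal instruction mixture" the text refers to.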
GeForce 8 performs significantly better
texture filtering
than its predecessors, which used various optimizations and visual tricks to speed up rendering without impairing filtering quality. The GeForce 8 line correctly renders an angle-independent
anisotropic filtering
algorithm along with full
trilinear texture filtering. G80, though not its smaller brethren, is equipped with much more texture filtering arithmetic ability than the GeForce 7 series. This allows high-quality filtering with a much smaller performance hit than previously.
NVIDIA has also introduced new polygon edge
anti-aliasing methods, including the ability of the GPU's
ROPs to perform both
multisample anti-aliasing
(MSAA) and HDR lighting at the same time, correcting various limitations of previous generations. GeForce 8 can perform MSAA with both FP16 and FP32 texture formats. GeForce 8 supports 128-bit
HDR rendering, an increase from prior cards' 64-bit support. The chip's new anti-aliasing technology, called coverage sampling AA (CSAA), uses Z, color, and coverage information to determine final pixel color. This technique of color optimization allows 16X CSAA to look crisp and sharp.
[Sommefeldt, Rys. NVIDIA G80: Image Quality Analysis, Beyond3D, 12 December 2006.]
Performance
The claimed theoretical
single-precision processing power for Tesla-based cards given in
FLOPS
may be hard to reach in real-world workloads.
In G80/G90/GT200, each Streaming Multiprocessor (SM) contains 8 Shader Processors (SP, or Unified Shader, or
CUDA
Core) and 2 Special Function Units (SFU). Each SP can fulfill up to two single-precision operations per clock: 1 Multiply and 1 Add, using a single
MAD instruction. Each SFU can fulfill up to four operations per clock: four MUL (Multiply) instructions. So one SM as a whole can execute 8 MADs (16 operations) and 8 MULs (8 operations) per clock, or 24 operations per clock, which is (relatively speaking) 3 times the number of SPs. Therefore, to calculate the theoretical dual-issue MAD+MUL performance in floating point operations per second
($\text{FLOPS}_{sp+sfu}$, in GFLOPS) of a graphics card with SP count $n$ and shader frequency $f$ (in GHz), the formula is: $\text{FLOPS}_{sp+sfu} = 3 \times n \times f$.
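The per-SM counting behind the factor of 3 can be checked with a few lines of Python. This is a minimal sketch: the constant and function names are made up for illustration, the 1.35 GHz shader clock of the GeForce 8800 GTX comes from the text above, and the count of 128 SPs for that card is an assumption not stated in this article.

```python
# Per-SM resources as described in the text.
SPS_PER_SM = 8     # shader processors per streaming multiprocessor
SFUS_PER_SM = 2    # special function units per streaming multiprocessor
OPS_PER_SP = 2     # one MAD counts as a multiply plus an add
OPS_PER_SFU = 4    # four MULs per clock


def ops_per_sm_per_clock():
    """Peak single-precision operations one SM can issue per clock."""
    return SPS_PER_SM * OPS_PER_SP + SFUS_PER_SM * OPS_PER_SFU


def peak_gflops_sp_sfu(n_sp, shader_ghz):
    """Theoretical dual-issue MAD+MUL peak in GFLOPS for a card with
    `n_sp` shader processors at `shader_ghz` GHz."""
    ops_per_sp = ops_per_sm_per_clock() / SPS_PER_SM  # 24 / 8 = 3
    return ops_per_sp * n_sp * shader_ghz


# Assumed example: 128 SPs (not stated in the text) at 1.35 GHz.
print(f"{peak_gflops_sp_sfu(128, 1.35):.1f} GFLOPS")
```

Deriving the factor from the per-SM counts, rather than hard-coding 3, keeps the link to the 8 MAD + 8 MUL accounting visible.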
However, leveraging dual-issue performance like MAD+MUL is problematic:
* Dual-issuing the MUL is not available in graphics mode on G80/G90, though it was much improved in GT200. [Sommefeldt, Rys. NVIDIA G80: Architecture and GPU Analysis - Page 11, Beyond3D, 8 November 2006.]
* Not all combinations of instructions like MAD+MUL can be executed in parallel on the SP and SFU, because the SFU is rather specialized as it can only handle a specific subset of instructions: 32-bit floating point multiplication, transcendental functions, interpolation for parameter blending, reciprocal, reciprocal square root, sine, cosine, etc.
* The SFU could become busy for many cycles when executing these instructions, in which case it is unavailable for dual-issuing MUL instructions.
For these reasons, to estimate the performance of real-world workloads, it may be more helpful to ignore the SFU and assume only 1 MAD (2 operations) per SP per cycle. In this case, the formula for the theoretical performance in floating point operations per second becomes: $\text{FLOPS}_{sp} = 2 \times n \times f$.
The theoretical
double-precision processing power of a Tesla GPU is 1/8 of the single precision performance on GT200; there is no double precision support on G8x and G9x.
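The MAD-only estimate and the 1/8 double-precision ratio can be combined in a short sketch. The function names are hypothetical, and applying the 1/8 ratio to the MAD-only figure (rather than the dual-issue figure) is an interpretation, since the text does not say which single-precision number it refers to. The example card, a GT200 part with 240 SPs at a 1.296 GHz shader clock, uses assumed figures that do not appear in this article.

```python
def peak_gflops_mad_only(n_sp, shader_ghz):
    """Conservative single-precision peak: 1 MAD (2 ops) per SP per clock."""
    return 2 * n_sp * shader_ghz


def peak_gflops_double(n_sp, shader_ghz, has_fp64):
    """Double-precision peak, taken here as 1/8 of the MAD-only
    single-precision figure (an assumed reading of the text).
    Chips without double-precision support (G8x, G9x) yield 0."""
    if not has_fp64:
        return 0.0
    return peak_gflops_mad_only(n_sp, shader_ghz) / 8


# Assumed GT200 example: 240 SPs at 1.296 GHz (figures not from the text).
sp = peak_gflops_mad_only(240, 1.296)
dp = peak_gflops_double(240, 1.296, has_fp64=True)
print(f"single precision (MAD only): {sp:.2f} GFLOPS")
print(f"double precision:            {dp:.2f} GFLOPS")

# A G80-class chip has no double-precision support.
print(peak_gflops_double(128, 1.35, has_fp64=False))
```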
Video decompression/compression
NVDEC
NVENC
NVENC was only introduced in later chips.
Chips
* G80
* G84
* G86
* G92
* G92B
* G94
* G94B
* G96
* G96B
* G96C
* G98
* C77
* C78
* C79
* C7A
* C7A-ION
* ION
* GT200
* GT200B
* GT215
* GT216
* GT218
* C87
* C89
See also
* List of eponyms of Nvidia GPU microarchitectures
* List of Nvidia graphics processing units
* CUDA
* Scalable Link Interface (SLI)
* Qualcomm Adreno