A vision processing unit (VPU) is (as of 2018) an emerging class of microprocessor; it is a specific type of AI accelerator, designed to accelerate machine vision tasks.


Overview

Vision processing units are distinct from video processing units (which are specialised for video encoding and decoding) in their suitability for running machine vision algorithms such as convolutional neural networks (CNNs), the scale-invariant feature transform (SIFT) and similar. They may include direct interfaces to take data from cameras (bypassing any off-chip buffers), and place a greater emphasis on on-chip dataflow between many parallel execution units with scratchpad memory, like a manycore DSP. But, like video processing units, they may focus on low-precision fixed-point arithmetic for image processing.
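As a rough illustration of the low-precision fixed-point arithmetic mentioned above, the following sketch applies a 3x3 smoothing kernel to an 8-bit image using integer operations only. It is not taken from any particular VPU; the Q8 weight format, the kernel, and the names `to_fixed` and `convolve3x3_fixed` are assumptions chosen for the example.

```python
# Hypothetical illustration: 3x3 convolution on 8-bit pixels with
# Q8 fixed-point weights (real weight = integer weight / 256).

FRAC_BITS = 8                 # assumed number of fractional bits (Q8)
ONE = 1 << FRAC_BITS          # fixed-point representation of 1.0


def to_fixed(x):
    """Quantise a real-valued weight to a Q8 integer."""
    return int(round(x * ONE))


def convolve3x3_fixed(image, kernel_fixed):
    """Apply a 3x3 kernel to the interior of an 8-bit image, integers only."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]            # border pixels are copied unchanged
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0                            # wide integer accumulator
            for ky in range(3):
                for kx in range(3):
                    acc += kernel_fixed[ky][kx] * image[y + ky - 1][x + kx - 1]
            acc = (acc + (ONE >> 1)) >> FRAC_BITS   # round, then drop fractional bits
            out[y][x] = max(0, min(255, acc))       # saturate to the 8-bit range
    return out


# Gaussian-like smoothing kernel; the weights sum to 1.0 (256 in Q8).
KERNEL = [[to_fixed(w / 16) for w in row]
          for row in ([1, 2, 1], [2, 4, 2], [1, 2, 1])]

IMAGE = [
    [10, 10, 10, 10],
    [10, 90, 90, 10],
    [10, 90, 90, 10],
    [10, 10, 10, 10],
]
RESULT = convolve3x3_fixed(IMAGE, KERNEL)
```

Because every multiply and add works on small integers, hardware built for this style of arithmetic can use narrow multipliers and a single wide accumulator instead of floating-point units, which is one plausible reason such formats suit power-constrained designs.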


Contrast with GPUs

VPUs are also distinct from GPUs, which contain specialised hardware for rasterization and texture mapping (for 3D graphics), and whose memory architecture is optimised for manipulating bitmap images in off-chip memory (reading textures and modifying frame buffers, with random access patterns). VPUs are optimised for performance per watt, while GPUs mainly focus on absolute performance.

Target markets are robotics, the internet of things (IoT), new classes of digital cameras for virtual reality and augmented reality, smart cameras, and machine vision acceleration integrated into smartphones and other mobile devices.
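The difference in memory architecture described in this section can be sketched in software. The example below is only an analogy, not a model of any real chip: it processes an image one tile at a time through a small local buffer standing in for an on-chip scratchpad, so each pixel of the large external image is read exactly once. The tile size and the names `process_tiled`, `scratch`, and `external_reads` are assumptions made for the illustration.

```python
# Hypothetical sketch: tile-by-tile processing through a small local buffer,
# loosely imitating DMA transfers into and out of an on-chip scratchpad.

TILE = 2  # assumed tile edge length, kept tiny so the example is easy to follow


def process_tiled(image, threshold):
    """Binarise an image one tile at a time; return the result and read count."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    external_reads = 0                      # pixels fetched from the external image
    for ty in range(0, height, TILE):
        for tx in range(0, width, TILE):
            # "DMA in": copy one tile of the external image into the local buffer.
            scratch = [row[tx:tx + TILE] for row in image[ty:ty + TILE]]
            external_reads += sum(len(r) for r in scratch)
            # Compute only on the local copy.
            scratch = [[255 if p >= threshold else 0 for p in r] for r in scratch]
            # "DMA out": write the finished tile back to the output image.
            for dy, r in enumerate(scratch):
                out[ty + dy][tx:tx + len(r)] = r
    return out, external_reads


IMAGE = [
    [12, 200, 40, 90],
    [130, 7, 255, 64],
    [0, 128, 127, 33],
]
BINARY, READS = process_tiled(IMAGE, 128)
```

A pointwise operation is used here so that tiles need no overlap; a neighbourhood operation such as a 3x3 filter would additionally need each tile to carry a one-pixel border from its neighbours.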


Examples


* Movidius Myriad X, which is the third-generation vision processing unit in the Myriad VPU line from Intel Corporation.
* Movidius Myriad 2, which finds use in Google Project Tango, Google Clips and DJI drones.
* Pixel Visual Core (PVC), which is a fully programmable image, vision and AI processor for mobile devices.
* Microsoft HoloLens, which includes an accelerator referred to as a "Holographic Processing Unit" (complementary to its CPU and GPU), aimed at interpreting camera inputs, to accelerate environment tracking and vision for augmented reality applications.
* Eyeriss, a design from MIT intended for running convolutional neural networks.
* NeuFlow, a design by Yann LeCun (implemented in FPGA) for accelerating convolutions, using a dataflow architecture.
* Mobileye EyeQ, by Mobileye.
* Programmable Vision Accelerator (PVA), a 7-way VLIW vision processor designed by Nvidia.


Similar processors

Some processors are not described as VPUs, but are equally applicable to machine vision tasks. These may form a broader category of AI accelerators (to which VPUs may also belong); however, as of 2016 there is no consensus on the name:
* IBM TrueNorth, a neuromorphic processor aimed at similar sensor data pattern recognition and intelligence tasks, including video/audio.
* Qualcomm Zeroth Neural processing unit, another entry in the emerging class of sensor/AI oriented chips.


See also

* Adapteva Epiphany, a manycore processor with similar emphasis on on-chip dataflow, focussed on 32-bit floating-point performance.
* CELL, a multicore processor with features fairly consistent with vision processing units (SIMD instructions and datatypes suitable for video, and on-chip DMA between scratchpad memories).
* Coprocessor
* Graphics processing unit, also commonly used to run vision algorithms. Nvidia's Pascal architecture includes FP16 support, to provide a better precision/cost tradeoff for AI workloads.
* MPSoC
* OpenCL
* OpenVX
* Physics processing unit, a past attempt to complement the CPU and GPU with a high-throughput accelerator.
* Tensor processing unit, a chip used internally by Google for accelerating AI calculations.


External links


Eyeriss architecture

Holographic processing unit

NeuFlow: A Runtime Reconfigurable Dataflow Processor for Vision