A vision processing unit (VPU) is (as of 2023) an emerging class of microprocessor; it is a specific type of AI accelerator, designed to accelerate machine vision tasks.
Overview
Vision processing units are distinct from video processing units (which are specialised for video encoding and decoding) in their suitability for running machine vision algorithms such as CNN (convolutional neural networks), SIFT (scale-invariant feature transform) and similar.
They may include direct interfaces to take data from cameras (bypassing any off-chip buffers), and have a greater emphasis on on-chip dataflow between many parallel execution units with scratchpad memory, like a manycore DSP. But, like video processing units, they may have a focus on low-precision fixed-point arithmetic for image processing.
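As an illustration of the low-precision fixed-point arithmetic mentioned above, the following is a minimal sketch in Python, not the method of any particular VPU. The function names `to_fixed` and `conv3x3_fixed`, the 8 fractional bits, and the 3x3 kernel size are assumptions chosen for the example. It applies a 3x3 filter to an 8-bit greyscale image using integer multiplies, adds and shifts only, which is the kind of operation such hardware tends to favour over floating point.

```python
def to_fixed(x, frac_bits=8):
    """Quantise a real-valued weight to a signed fixed-point integer."""
    return int(round(x * (1 << frac_bits)))


def conv3x3_fixed(image, kernel, frac_bits=8):
    """Apply a 3x3 filter to an 8-bit greyscale image (a list of rows)
    using integer arithmetic only.

    The kernel is applied without flipping (cross-correlation, as is usual
    in CNN layers). Border pixels are skipped, so the output is two rows
    and two columns smaller than the input.
    """
    qk = [[to_fixed(w, frac_bits) for w in row] for row in kernel]
    half = 1 << (frac_bits - 1)  # rounding offset added before the shift
    height, width = len(image), len(image[0])
    out = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            acc = 0  # wide integer accumulator
            for ky in range(3):
                for kx in range(3):
                    acc += qk[ky][kx] * image[y + ky - 1][x + kx - 1]
            value = (acc + half) >> frac_bits  # rescale back to pixel range
            row.append(max(0, min(255, value)))  # saturate to 8 bits
        out.append(row)
    return out
```

With 8 fractional bits the weight 1/9 is stored as 28/256, so a box blur over a flat image of value 90 yields 89 rather than 90; this small quantisation error is the price paid for cheaper arithmetic.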
Contrast with GPUs
They are distinct from GPUs, which contain specialised hardware for rasterization and texture mapping (for 3D graphics), and whose memory architecture is optimised for manipulating bitmap images in off-chip memory (reading textures, and modifying frame buffers, with random access patterns). VPUs are optimised for performance per watt, while GPUs mainly focus on absolute performance.
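The difference between absolute performance and performance per watt can be made concrete with a short calculation. The sketch below is illustrative only: the device names and the throughput and power figures are hypothetical values invented for the example, not measurements of any real GPU or VPU.

```python
def perf_per_watt(ops_per_second, watts):
    """Return throughput per watt (operations per second per watt)."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return ops_per_second / watts


# Hypothetical figures, chosen only to illustrate the trade-off.
devices = {
    "hypothetical_gpu": {"ops_per_second": 10e12, "watts": 250.0},
    "hypothetical_vpu": {"ops_per_second": 1e12, "watts": 2.0},
}

efficiency = {name: perf_per_watt(**spec) for name, spec in devices.items()}

# The device with the highest raw throughput ...
fastest = max(devices, key=lambda name: devices[name]["ops_per_second"])
# ... need not be the one that does the most work per watt.
most_efficient = max(efficiency, key=efficiency.get)
```

Under these assumed numbers the GPU-like device has ten times the raw throughput, while the VPU-like device delivers more operations per watt, which matters most in battery-powered or thermally constrained products.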
Target markets are robotics, the internet of things (IoT), new classes of digital cameras for virtual reality and augmented reality, smart cameras, and the integration of machine vision acceleration into smartphones and other mobile devices.
Examples
* Movidius Myriad X, which is the third-generation vision processing unit in the Myriad VPU line from Intel Corporation.
* Movidius Myriad 2, which finds use in Google Project Tango, Google Clips and DJI drones.
* Pixel Visual Core (PVC), which is a fully programmable image, vision and AI processor for mobile devices.
* Microsoft HoloLens, which includes an accelerator referred to as a ''holographic processing unit'' (complementary to its CPU and GPU), aimed at interpreting camera inputs, to accelerate environment tracking and vision for augmented reality applications.
* Eyeriss, a design from MIT intended for running convolutional neural networks.
* NeuFlow, a design by Yann LeCun (implemented in FPGA) for accelerating convolutions, using a dataflow architecture.
* Mobileye EyeQ, by Mobileye.
* Programmable Vision Accelerator (PVA), a 7-way VLIW vision processor designed by Nvidia.
Broader category
Some processors are not described as VPUs, but are equally applicable to machine vision tasks. These may form a broader category of AI accelerators (to which VPUs may also belong); however, as of 2016 there is no consensus on the name:
* IBM TrueNorth, a neuromorphic processor aimed at similar sensor data pattern recognition and intelligence tasks, including video/audio.
* Qualcomm Zeroth neural processing unit, another entry in the emerging class of sensor/AI-oriented chips.
* All models of Intel Meteor Lake processors have a built-in Versatile Processor Unit (VPU) for accelerating inference for computer vision and deep learning.
See also
* Adapteva Epiphany, a manycore processor with similar emphasis on on-chip dataflow, focussed on 32-bit floating point performance
* CELL, a multicore processor with features fairly consistent with vision processing units (SIMD instructions and datatypes suitable for video, and on-chip DMA between scratchpad memories)
* Coprocessor
* Graphics processing unit, also commonly used to run vision algorithms. Nvidia's Pascal architecture includes FP16 support, to provide a better precision/cost tradeoff for AI workloads
* MPSoC
* OpenCL
* OpenVX
* Physics processing unit, a past attempt to complement the CPU and GPU with a high-throughput accelerator
* Tensor Processing Unit, a chip used internally by Google for accelerating AI calculations
External links
* Eyeriss architecture
* Holographic processing unit
* NeuFlow: A Runtime Reconfigurable Dataflow Processor for Vision