Stream Processors, Inc

Stream Processors, Inc. was a Silicon Valley-based fabless semiconductor company specializing in the design and manufacture of high-performance digital signal processors for applications including video surveillance, multi-function printers and video conferencing. The company ceased operations in 2009.


Company history

Foundational work in stream processing was initiated in 1995 by a research team led by MIT professor Bill Dally. In 1996, he moved to Stanford University, where he continued this work, receiving a multimillion-dollar grant from DARPA with additional resources from Intel and Texas Instruments to fund the development of a project called "Imagine": the first stream processor chip and accompanying compiler tools.


The Imagine Project

The goal of the Imagine project was to develop a C-programmable signal and image processor intended to provide both the performance density and efficiency of a special-purpose processor (such as a hard-wired ASIC). The project successfully demonstrated the advantages of stream processing. Details on the Imagine project and its results are posted on the Stanford Imagine project page. The work also showed that a number of applications, ranging from wireless baseband processing, 3D graphics, encryption and IP forwarding to video processing, could take advantage of the efficiency of stream processing. This research inspired other designs such as GPUs from ATI Technologies as well as the Cell microprocessor from Sony, Toshiba, and IBM.

The main deliverables from the Imagine program included:
* The Imagine Stream Architecture
* The Stream programming model
* Software development tools
* Programmable graphics and real-time media applications
* VLSI prototype (fabricated by TI)
* Stream processor development platform (a prototype development board)


SPI established

Dally, together with other team members, obtained a license from Stanford to commercialize the resulting technology. Stream Processors, Incorporated (SPI) was incorporated in California in 2004. Professor Dally remained at Stanford, and the company hired industry veteran Chip Stearns to become President and CEO in December of that year. Through June 2006, SPI raised a total of $26M from a trio of venture capital firms: Austin Ventures, Norwest Venture Partners and the Woodside Fund. The company launched its first two products concurrently with the International Solid-State Circuits Conference (ISSCC) in February 2006 and introduced two others afterwards. SPI was headquartered in Sunnyvale, California, and had a software development group (SPI Software Technologies Pvt. Ltd) in Bangalore, India.

In January 2009, co-founder Bill Dally accepted a position as Chief Scientist with NVIDIA Corporation and at the same time resigned as chairman. In an interview, Dally reflected on his experiences with startups: "I have done several chip startups myself. It’s getting hard. The ante is very high. If you do a chip startup, you need patient investors with very deep pockets. It’s many tens of millions of dollars to get to a first product and $50 million to get to profits. That’s very difficult to do because investors want an exit some multiple over that investment. I am hoping we return to the days of frequent IPOs and get beyond the fire-sale acquisitions. That’s not what you can see right now. If it’s a programmable chip, the cost is even more."

In the summer of 2009, CEO Stearns left the company and was replaced by Mike Fister, an executive with senior-level experience at Cadence Design Systems and Intel. In September 2009 the company ceased operations. (http://sanjose.bizjournals.com/sanjose/stories/2009/11/02/daily124.html)


Technology

Similar to graphics and scientific computing, media and signal processing are characterized by available data-parallelism, locality and a high ratio of computation to global memory access. Stream processing exploits these characteristics using data-parallel processing fed by a distributed memory hierarchy managed by the compiler. The main challenge for next-generation massively parallel processors is data bandwidth, not computational resources. Unlike most conventional processors, the technology does not rely on a hardware cache; instead, data movement is explicitly managed by the compiler and hardware.

The execution model is based on accelerating performance-critical functions (kernels) that process and produce data records (streams). Kernels and streams are scheduled at compile time and moved to on-chip memory at runtime via a scoreboard. The compiler analyses the lifetimes of streams to optimize allocation and minimize external memory bandwidth needs. Stream and kernel loads can overlap with execution to improve latency tolerance, and the explicit data movement provides predictable performance. There are no CPU cache misses, and the design presents a single-core model to the programmer: data-parallelism is within the kernels.
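The effect of keeping intermediate streams on chip can be illustrated with a toy model. The sketch below is not SPI's toolchain or API; every name in it (`run_pipeline`, `run_unfused`, `strip_size`, the three example kernels) is hypothetical, and "external traffic" is simply a count of records loaded from or stored to a list standing in for external memory. It assumes each kernel maps one input record to one output record.

```python
"""Toy model of kernels consuming and producing streams with explicit data movement."""


def run_pipeline(external_input, kernels, strip_size):
    """Apply `kernels` in order to `external_input`, one strip at a time.

    Each strip is loaded once into a local buffer (standing in for on-chip
    memory), passed through every kernel while it stays local, and stored
    once. Returns the output stream and the number of records moved to or
    from external memory.
    """
    output = []
    external_traffic = 0
    for start in range(0, len(external_input), strip_size):
        # Explicit load: external memory -> local buffer.
        local = external_input[start:start + strip_size]
        external_traffic += len(local)
        # Intermediate streams never leave the local buffer.
        for kernel in kernels:
            local = [kernel(record) for record in local]
        # Explicit store: local buffer -> external memory.
        output.extend(local)
        external_traffic += len(local)
    return output, external_traffic


def run_unfused(external_input, kernels):
    """Baseline: every kernel reads and writes its whole stream externally."""
    stream = list(external_input)
    external_traffic = 0
    for kernel in kernels:
        external_traffic += len(stream)  # load the kernel's input stream
        stream = [kernel(record) for record in stream]
        external_traffic += len(stream)  # store the kernel's output stream
    return stream, external_traffic


# Three hypothetical per-record kernels.
def scale(x):
    return 2 * x


def offset(x):
    return x + 1


def clamp(x):
    return min(x, 10)


samples = list(range(8))
fused_out, fused_traffic = run_pipeline(samples, [scale, offset, clamp], strip_size=4)
plain_out, plain_traffic = run_unfused(samples, [scale, offset, clamp])
```

Both versions compute the same output, but in this model the strip-at-a-time version touches external memory only for the first load and the final store of each record, which is the kind of saving that stream lifetime analysis aims for.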


Architecture

The architecture includes a host CPU (System MIPS) for system-level tasks and a DSP Coprocessor Subsystem, in which the DSP MIPS runs the main threads that make kernel function calls to the Data Parallel Unit (DPU). For users who rely on libraries and do not intend to develop DSP code, the architecture is a MIPS-based system-on-a-chip with an API to a “black box” coprocessor.

The DPU Dispatcher receives kernel function calls to manage runtime kernel and stream loads. One kernel at a time is executed across the lanes, operating on local stream data stored in the Lane Register File of each lane. Each lane has a set of VLIW ALUs, and distributed operand register files (ORFs) allow for a large working data set and processing bandwidth exceeding 1 terabyte/s. The Stream Load/Store Unit provides gather/scatter with a wide variety of access patterns. The InterLane Switch is a compiler-scheduled full crossbar for high-speed access between lanes.
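As a rough mental model of the data path described above, the following sketch gathers a strided stream from memory, deals it out to per-lane storage, runs one kernel across all lanes, and applies a single inter-lane permutation. It is only an illustration: the function names, the round-robin distribution, the lane count of 4 and the rotate pattern are assumptions made for the example, not details of the DPU.

```python
def gather(memory, base, stride, count):
    """Strided gather: one simple example of a memory access pattern."""
    return [memory[base + i * stride] for i in range(count)]


def distribute(stream, num_lanes):
    """Deal records round-robin into per-lane lists (stand-ins for lane register files)."""
    return [stream[lane::num_lanes] for lane in range(num_lanes)]


def execute_kernel(kernel, lane_files):
    """Run the same kernel over every lane's local data (one kernel at a time)."""
    return [[kernel(record) for record in lane] for lane in lane_files]


def interlane_rotate(lane_files, shift):
    """Model one crossbar permutation: lane i receives the data of lane (i + shift) % n."""
    n = len(lane_files)
    return [lane_files[(i + shift) % n] for i in range(n)]


memory = list(range(100, 132))                       # stand-in for external memory
stream = gather(memory, base=0, stride=2, count=8)   # every other word
lanes = distribute(stream, num_lanes=4)
squared = execute_kernel(lambda x: x * x, lanes)
rotated = interlane_rotate(lanes, shift=1)
```

A full crossbar can realize any lane-to-lane permutation; the rotate here is just the simplest one to write down.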


Tools

SPI's RapiDev Tools Suite leverages the predictability of stream processing to provide a fast path to optimized results using C programming. Starting with C reference code, the Fast Functional Debugger (FFD) library plugs into standard tools, such as Microsoft Visual Studio and GNU tools, and simulates the DPU to support restructuring code into kernels and streams. Because kernels are statically scheduled and data movement is explicit, DPU cycle accuracy can be obtained even at this high functional level. This is one source of the predictability of the architecture.

For targeting code to the device, the Stream Processor Compiler (SPC) generates the VLIW executable and pre-processed C code that is compiled and linked via standard GCC for MIPS. SPC allocates streams in the Lane Register Files and provides dependency information for the kernel function calls. Software pipelining and loop unrolling are supported. Branch penalties are avoided by predicated selects, and larger conditionals use conditional streams.

Running under Eclipse, the Target Code Simulator provides comprehensive host or device binary code simulation with breakpoint and single-stepping capabilities as well as bandwidth and load statistics. A kernel view shows the VLIW pipeline for kernel optimizations, and a stream view shows kernel execution and stream loads to review global data movement for system profiling.
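The three transformations named above can be sketched in a few lines. This is an illustration of the general techniques, not output from SPC, and all function names are hypothetical. `select` evaluates both arms and picks one arithmetically, `conditional_stream` shows one reading of "conditional streams" in which records are routed into separate streams so each arm can run as its own kernel, and `sum_unrolled_by_4` is a hand-unrolled reduction with a remainder loop.

```python
def select(pred, if_true, if_false):
    """Branch-free select: both inputs are already computed; pred picks one."""
    mask = int(bool(pred))
    return mask * if_true + (1 - mask) * if_false


def abs_diff_kernel(a, b):
    """|a - b| without a branch: evaluate both arms, then select."""
    return select(a >= b, a - b, b - a)


def conditional_stream(records, predicate):
    """Route records into two streams so each arm of a conditional gets its own kernel."""
    taken = [r for r in records if predicate(r)]
    not_taken = [r for r in records if not predicate(r)]
    return taken, not_taken


def sum_unrolled_by_4(values):
    """Sum with the loop body unrolled by 4 and four independent accumulators."""
    acc0 = acc1 = acc2 = acc3 = 0
    main_len = len(values) - len(values) % 4
    for i in range(0, main_len, 4):
        acc0 += values[i]
        acc1 += values[i + 1]
        acc2 += values[i + 2]
        acc3 += values[i + 3]
    total = acc0 + acc1 + acc2 + acc3
    # Remainder loop for lengths that are not a multiple of 4.
    for i in range(main_len, len(values)):
        total += values[i]
    return total


evens, odds = conditional_stream(list(range(6)), lambda r: r % 2 == 0)
```

Independent accumulators matter on a VLIW machine because they remove the dependency between consecutive additions, giving the scheduler operations it can issue in parallel.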


Products

SPI marketed its Storm-1 family, which included four fully software-programmable DSPs of varying performance levels. Note: GMACS stands for giga (billions of) multiply-accumulate operations per second, a common measure of DSP performance.
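A peak GMACS figure is usually derived as the number of multiply-accumulates issued per clock cycle multiplied by the clock rate. The sketch below shows that arithmetic only; the lane count, ALUs per lane, MACs per ALU and clock frequency are made-up illustrative values, not Storm-1 specifications.

```python
def gmacs(lanes, alus_per_lane, macs_per_alu_per_cycle, clock_hz):
    """Peak giga multiply-accumulates per second for a lane-based design."""
    macs_per_second = lanes * alus_per_lane * macs_per_alu_per_cycle * clock_hz
    return macs_per_second / 1e9


# Hypothetical device: 16 lanes x 5 ALUs x 2 MACs per ALU per cycle at 500 MHz.
example = gmacs(lanes=16, alus_per_lane=5, macs_per_alu_per_cycle=2, clock_hz=500e6)
```

Such figures describe peak throughput; sustained performance depends on how well the memory hierarchy keeps the ALUs supplied with data.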


Support hardware and software

* The RapiDev tools suite delivers a fast, predictable path to optimized results, eliminating the complexities of assembly coding or manual cache management
* The Storm-1 DevKit is a PCI-based software development platform
* The IP Camera Reference Design runs standard Linux 2.6 and supports multiple simultaneous codecs (e.g. H.264, MPEG-4 and MJPEG), arbitrary resolutions, CMOS and CCD sensor processing as well as video analytics in a fully software-programmable platform
* The Video Streamer Reference Design supports eight 4CIF input channels of video compressed to H.264 and a Gigabit Ethernet output


External links


The Imagine Project (Stanford) website