InfiniBand (IB) is a computer networking communications standard used in high-performance computing that features very high throughput and very low latency. It is used for data interconnect both among and within computers. InfiniBand is also used as either a direct or switched interconnect between servers and storage systems, as well as an interconnect between storage systems. It is designed to be scalable and uses a switched fabric network topology.
By 2014, it had become the most commonly used interconnect in the TOP500 list of supercomputers, a position it held until about 2016.
Mellanox (acquired by Nvidia) manufactures InfiniBand host bus adapters and network switches, which are used by large computer system and database vendors in their product lines.
As a computer cluster interconnect, IB competes with Ethernet, Fibre Channel, and Intel Omni-Path. The technology is promoted by the InfiniBand Trade Association.
History
InfiniBand originated in 1999 from the merger of two competing designs: Future I/O and Next Generation I/O (NGIO). NGIO was led by Intel, with a specification released in 1998, and joined by Sun Microsystems and Dell. Future I/O was backed by Compaq, IBM, and Hewlett-Packard. This led to the formation of the InfiniBand Trade Association (IBTA), which included both sets of hardware vendors as well as software vendors such as Microsoft.
At the time it was thought some of the more powerful computers were approaching the interconnect bottleneck of the PCI bus, in spite of upgrades like PCI-X.
Version 1.0 of the InfiniBand Architecture Specification was released in 2000. Initially the IBTA vision for IB was simultaneously a replacement for PCI in I/O, Ethernet in the machine room, cluster interconnect, and Fibre Channel. IBTA also envisaged decomposing server hardware on an IB fabric.
Mellanox had been founded in 1999 to develop NGIO technology, but by 2001 shipped an InfiniBand product line called InfiniBridge at 10 Gbit/s speeds.
Following the burst of the dot-com bubble there was hesitation in the industry to invest in such a far-reaching technology jump. By 2002, Intel announced that instead of shipping IB integrated circuits ("chips"), it would focus on developing PCI Express, and Microsoft discontinued IB development in favor of extending Ethernet. Sun and Hitachi continued to support IB.
In 2003, the System X supercomputer built at Virginia Tech used InfiniBand in what was estimated to be the third largest computer in the world at the time.
The OpenIB Alliance (later renamed OpenFabrics Alliance) was founded in 2004 to develop an open set of software for the Linux kernel. By February 2005, the support was accepted into the 2.6.11 Linux kernel. In November 2005, storage devices using InfiniBand were finally released by vendors such as Engenio.
Of the top 500 supercomputers in 2009, Gigabit Ethernet was the internal interconnect technology in 259 installations, compared with 181 using InfiniBand.
In 2010, market leaders Mellanox and Voltaire merged, leaving just one other IB vendor, QLogic, primarily a Fibre Channel vendor.
At the 2011 International Supercomputing Conference, links running at about 56 gigabits per second (known as FDR) were announced and demonstrated by connecting booths in the trade show.
In 2012, Intel acquired QLogic's InfiniBand technology, leaving only one independent supplier.
By 2014, InfiniBand was the most popular internal connection technology for supercomputers, although within two years, 10 Gigabit Ethernet started displacing it.
In 2016, it was reported that Oracle Corporation (an investor in Mellanox) might engineer its own InfiniBand hardware.
In 2019, Nvidia acquired Mellanox, the last independent supplier of InfiniBand products.
Specification
Specifications are published by the InfiniBand Trade Association.
Performance
Original names for speeds were single-data rate (SDR), double-data rate (DDR) and quad-data rate (QDR) as given below.
Subsequently, other three-letter acronyms were added for even higher data rates.
Links can be aggregated: most systems use a 4-lane (4x) connector (QSFP). HDR often makes use of 2x links (known as HDR100: a 100 Gbit/s link using 2 lanes of HDR, while still using a QSFP connector). 8x is called for with NDR switch ports using OSFP (Octal Small Form Factor Pluggable) connectors.
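The lane arithmetic behind aggregated links can be illustrated with a minimal Python sketch. The helper name `link_rate_gbps` and the table are hypothetical. The per-lane figures are rounded, assumed values: 50 Gbit/s for HDR follows from the HDR100 example (2 lanes giving 100 Gbit/s), and 14 Gbit/s for FDR assumes that the "about 56 gigabits per second" figure refers to a 4-lane link.

```python
# Nominal per-lane data rates in Gbit/s. These are rounded, assumed values
# used for illustration only, not figures quoted from the specification.
PER_LANE_GBPS = {
    "FDR": 14,  # assumption: about 56 Gbit/s spread over a 4-lane link
    "HDR": 50,  # from the HDR100 example: 2 lanes give 100 Gbit/s
}


def link_rate_gbps(generation, lanes):
    """Return the nominal aggregate rate of a link with the given lane count."""
    if lanes < 1:
        raise ValueError("a link needs at least one lane")
    return PER_LANE_GBPS[generation] * lanes


if __name__ == "__main__":
    print("HDR 2x (HDR100):", link_rate_gbps("HDR", 2), "Gbit/s")
    print("HDR 4x:", link_rate_gbps("HDR", 4), "Gbit/s")
    print("FDR 4x:", link_rate_gbps("FDR", 4), "Gbit/s")
```

The sketch treats the aggregate rate as a simple product of lane count and per-lane rate, and ignores encoding overhead.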
InfiniBand provides remote direct memory access (RDMA) capabilities for low CPU overhead.
Topology
InfiniBand uses a switched fabric topology, as opposed to early shared medium Ethernet. All transmissions begin or end at a channel adapter. Each processor contains a host channel adapter (HCA) and each peripheral has a target channel adapter (TCA). These adapters can also exchange information for security or quality of service (QoS).
Messages
InfiniBand transmits data in packets of up to 4 KB that are taken together to form a message. A message can be:
* a remote direct memory access read or write
* a channel send or receive
* a transaction-based operation (that can be reversed)
* a multicast transmission
* an atomic operation
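The relationship between packets and messages can be sketched in Python as follows. This is only an illustration of splitting a message into chunks of at most 4 KB and joining them again. The function names are hypothetical, and the sketch ignores packet headers, ordering guarantees, and everything else a real channel adapter handles.

```python
MAX_PAYLOAD = 4096  # bytes; the 4 KB upper bound on packet size


def split_message(message, max_payload=MAX_PAYLOAD):
    """Split a message (bytes) into packets of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [message[i:i + max_payload]
            for i in range(0, len(message), max_payload)]


def reassemble(packets):
    """Join packets, in order, back into the original message."""
    return b"".join(packets)


if __name__ == "__main__":
    message = bytes(10000)  # a 10,000-byte message of zero bytes
    packets = split_message(message)
    print([len(p) for p in packets])
    print(reassemble(packets) == message)
```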
Physical interconnection
In addition to a board form factor connection, it can use both active and passive copper (up to 10 meters) and optical fiber cable (up to 10 km). QSFP connectors are used.
The InfiniBand Trade Association also specified the CXP connector system for speeds up to 120 Gbit/s over copper, active optical cables, and optical transceivers using parallel multi-mode fiber cables with 24-fiber MPO connectors.
Software interfaces
Mellanox operating system support is available for Solaris, FreeBSD, Red Hat Enterprise Linux, SUSE Linux Enterprise Server (SLES), Windows, HP-UX, VMware ESX, and AIX.
InfiniBand has no specific standard application programming interface (API). The standard only lists a set of verbs such as ibv_open_device or ibv_post_send, which are abstract representations of functions or methods that must exist. The syntax of these functions is left to the vendors. For reference, this is sometimes called the verbs API. The de facto standard software is developed by the OpenFabrics Alliance and called the Open Fabrics Enterprise Distribution (OFED). It is released under a choice of two licenses, GPL2 or BSD, for Linux and FreeBSD, and as Mellanox OFED for Windows (product names: WinOF / WinOF-2; attributed as host controller driver for matching specific ConnectX 3 to 5 devices) under a BSD license.
It has been adopted by most of the InfiniBand vendors, for Linux, FreeBSD, and Microsoft Windows.
IBM refers to a software library called libverbs for its AIX operating system, as well as "AIX InfiniBand verbs".
The Linux kernel support was integrated in 2005 into the kernel version 2.6.11.
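As a rough illustration of what the verbs interface looks like in practice, the following Python sketch uses `ctypes` to call the libibverbs C library shipped with OFED, if it happens to be installed. It only enumerates devices with `ibv_get_device_list` and `ibv_get_device_name`, which is the step that normally precedes `ibv_open_device`. The wrapper name `list_ib_devices` is hypothetical, and the sketch assumes a typical Linux installation; on hosts without the library it simply returns an empty list.

```python
import ctypes
import ctypes.util


def list_ib_devices():
    """Return names of RDMA devices visible through libibverbs, or []."""
    path = ctypes.util.find_library("ibverbs")
    if path is None:
        return []  # libibverbs is not installed on this host
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return []

    # struct ibv_device **ibv_get_device_list(int *num_devices);
    lib.ibv_get_device_list.restype = ctypes.POINTER(ctypes.c_void_p)
    lib.ibv_get_device_list.argtypes = [ctypes.POINTER(ctypes.c_int)]
    # const char *ibv_get_device_name(struct ibv_device *device);
    lib.ibv_get_device_name.restype = ctypes.c_char_p
    lib.ibv_get_device_name.argtypes = [ctypes.c_void_p]
    # void ibv_free_device_list(struct ibv_device **list);
    lib.ibv_free_device_list.restype = None
    lib.ibv_free_device_list.argtypes = [ctypes.POINTER(ctypes.c_void_p)]

    count = ctypes.c_int(0)
    devices = lib.ibv_get_device_list(ctypes.byref(count))
    if not devices:
        return []  # NULL: no RDMA support available
    try:
        names = []
        for i in range(count.value):
            name = lib.ibv_get_device_name(devices[i])
            names.append(name.decode() if name else "")
        return names
    finally:
        # The list must be released with the matching library call.
        lib.ibv_free_device_list(devices)


if __name__ == "__main__":
    print(list_ib_devices())
```

Production code would more commonly call these functions directly from C, or through a maintained binding, rather than through hand-written `ctypes` declarations.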
Ethernet over InfiniBand
Ethernet over InfiniBand, abbreviated to EoIB, is an Ethernet implementation over the InfiniBand protocol and connector technology. EoIB enables multiple Ethernet bandwidths, depending on the InfiniBand (IB) version. Ethernet's implementation of the Internet Protocol Suite, usually referred to as TCP/IP, differs in some details from the direct InfiniBand protocol in IP over IB (IPoIB).
See also
* 100 Gigabit Ethernet
* iSCSI Extensions for RDMA
* iWARP
* List of interface bit rates
* Optical communication
* Parallel optical interface
* SCSI RDMA Protocol
External links
* InfiniBand Trade Association web site