AES67

AES67 is a technical standard for audio over IP and audio over Ethernet (AoE) interoperability. The standard was developed by the Audio Engineering Society and first published in September 2013. It is a layer 3 protocol suite based on existing standards and is designed to allow interoperability between various IP-based audio networking systems such as RAVENNA, Livewire, Q-LAN and Dante. AES67 promises interoperability between previously competing networked audio systems and long-term network interoperation between systems. It also provides interoperability with layer 2 technologies, like Audio Video Bridging (AVB). Since its publication, AES67 has been implemented independently by several manufacturers and adopted by many others.


Overview

AES67 defines requirements for synchronizing clocks, setting QoS priorities for media traffic, and initiating media streams with standard protocols from the Internet protocol suite. AES67 also defines the audio sample format and sample rate, the supported number of channels, the IP data packet size, and latency/buffering requirements. The standard calls out several protocol options for device discovery but does not require any of them to be implemented. Session Initiation Protocol (SIP) is used for unicast connection management. No connection management protocol is defined for multicast connections.


Synchronization

AES67 uses IEEE 1588-2008 Precision Time Protocol (PTPv2) for clock synchronisation. For standard networking equipment, AES67 defines configuration parameters for a "PTP profile for media applications", based on IEEE 1588 delay request-response sync and (optionally) peer-to-peer sync (IEEE 1588 Annexes J.3 and J.4); event messages are encapsulated in IPv4 packets over UDP transport (IEEE 1588 Annex D). Some of the default parameters are adjusted; specifically, logSyncInterval and logMinDelayReqInterval are reduced to improve accuracy and startup time. Clock Grade 2, as defined for the AES11 Digital Audio Reference Signal (DARS), is signaled with clockClass. Network equipment conforming to IEEE 1588-2008 uses default PTP profiles; for video streams, the SMPTE 2059-2 PTP profile can be used. In AVB/TSN networks, synchronization is achieved with the IEEE 802.1AS profile for Time-Sensitive Applications. The media clock is based on synchronized network time with an IEEE 1588 epoch (1 January 1970 00:00:00 TAI). Clock rates are fixed at the audio sampling frequencies of 44.1 kHz, 48 kHz and 96 kHz (that is, thousands of samples per second). RTP transport works with a fixed time offset to the network clock.
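The relationship between network time and the media clock can be sketched in a few lines. This is a minimal illustration, not text from the standard: it assumes the media clock counts samples from the IEEE 1588 epoch and that the RTP timestamp is that count plus a fixed per-stream offset, truncated to 32 bits. The function name `rtp_timestamp` and the `offset` parameter are hypothetical.

```python
RTP_TS_MODULUS = 2 ** 32  # RTP timestamps are 32-bit values and wrap around


def rtp_timestamp(ptp_seconds: int, ptp_nanoseconds: int,
                  sample_rate_hz: int, offset: int = 0) -> int:
    """Map a PTP (TAI) time to a 32-bit RTP timestamp.

    Assumption: the media clock counts samples since the IEEE 1588 epoch,
    and `offset` is the fixed offset between media clock and RTP clock.
    """
    # Integer arithmetic avoids floating-point rounding on large second counts.
    samples = (ptp_seconds * sample_rate_hz
               + ptp_nanoseconds * sample_rate_hz // 1_000_000_000)
    return (samples + offset) % RTP_TS_MODULUS


if __name__ == "__main__":
    # One second after the epoch at 48 kHz is 48000 samples.
    print(rtp_timestamp(1, 0, 48_000))
    # At 48 kHz the 32-bit counter wraps roughly every 24.9 hours.
    print(rtp_timestamp(89_479, 0, 48_000))
```

Because the timestamp wraps, a receiver has to compare timestamps modulo 2^32 rather than as absolute times.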


Transport

Media data is transported in IPv4 packets that are sized to avoid IP fragmentation. Real-time Transport Protocol (RTP) with the RTP Profile for Audio and Video (L24 and L16 formats) is used over UDP transport. The RTP payload is limited to 1460 bytes to prevent fragmentation with the default Ethernet MTU of 1500 bytes (after subtracting the IP/UDP/RTP overhead of 20+8+12=40 bytes). Contributing source (CSRC) identifiers and TLS encryption are not supported.

Time synchronization, media stream delivery, and discovery protocols may use IP multicasting with IGMPv2 (optionally IGMPv3) negotiation. Each media stream is assigned a unique multicast address (in the range from 239.0.0.0 to 239.255.255.255); only one device can send to this address (many-to-many connections are not supported). To monitor keepalive status and allocate bandwidth, devices may use the RTCP report interval, SIP session timers and OPTIONS ping, or ICMP Echo request (ping).

AES67 uses DiffServ to set QoS traffic priorities in the Differentiated Services Code Point (DSCP) field of the IP packet. Three classes should be supported at a minimum: clock, media and best effort. Clock traffic (the PTP messages Announce, Sync, Follow_Up, Delay_Req, Delay_Resp, Pdelay_Req, Pdelay_Resp and Pdelay_Resp_Follow_Up) is assigned to the Expedited Forwarding (EF) class, which typically implements strict priority per-hop behavior (PHB). A maximum delay of 250 μs may be required for time-critical applications to prevent audio dropouts. To prioritize critical media streams in a large network, applications may use additional values in the Assured Forwarding class 4 with low-drop probability (AF41), typically implemented as a weighted round-robin queue. All other traffic is handled on a best effort basis with Default Forwarding. The RTP Clock Source Signalling procedure is used to specify the PTP domain and grandmaster ID for each media stream.
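The DiffServ marking and the multicast address range described above can be illustrated with a short sketch. The DSCP numbers are the standard code points for EF, AF41 and Default Forwarding; the dictionary keys and the helper names (`tos_byte`, `is_stream_address`, `open_media_socket`) are hypothetical, and setting `IP_TOS` this way assumes a platform whose socket API exposes that option (as Linux does).

```python
import ipaddress
import socket

# Standard DiffServ code points for the classes named in the text.
DSCP = {
    "clock": 46,        # Expedited Forwarding (EF)
    "media": 34,        # Assured Forwarding 41 (AF41)
    "best_effort": 0,   # Default Forwarding (DF)
}

# Address range quoted in the text for media streams.
STREAM_RANGE = ipaddress.ip_network("239.0.0.0/8")


def tos_byte(dscp: int) -> int:
    """DSCP occupies the upper six bits of the IPv4 DS (former TOS) byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in six bits")
    return dscp << 2


def is_stream_address(addr: str) -> bool:
    """True if addr lies in 239.0.0.0 to 239.255.255.255."""
    return ipaddress.ip_address(addr) in STREAM_RANGE


def open_media_socket() -> socket.socket:
    """Create a UDP socket whose outgoing packets are marked AF41."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte(DSCP["media"]))
    return sock
```

Whether the marking has any effect depends on the switches and routers along the path honoring DSCP values.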


Audio encoding

Sample formats include 16-bit and 24-bit Linear PCM with 48 kHz sampling frequency, and optional 24-bit 96 kHz and 16-bit 44.1 kHz. Other RTP audio video formats may be supported. Support for multiple sample frequencies is optional. Devices may enforce a global sample frequency setting.

Media packets are scheduled according to "packet time", the real-time duration of the audio carried in a single packet. Packet time is negotiated by the stream source for each streaming session. Short packet times provide low latency and a high packet rate, but introduce high overhead and require high-performance equipment and links. Long packet times increase latency and require more buffering. A range from 125 μs to 4 ms is defined, and 1 ms is required for all devices. It is recommended that devices adapt to packet time changes and/or determine packet time by analyzing RTP timestamps. Packet time determines the RTP payload size for a given sample rate. Devices should support a minimum of 1 to 8 channels per stream. MTU size restrictions limit a 96 kHz audio stream using a 4 ms packet time to a single channel.
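The link between packet time, sample rate and payload size is simple arithmetic, sketched below. The 1460-byte limit and the 40-byte header overhead are the figures quoted in the Transport section; the function names are hypothetical, and the floor division is a simplification that is only exact when the sample rate divides evenly into the packet time (it does not at 44.1 kHz).

```python
MTU = 1500                         # default Ethernet MTU in bytes
HEADER_OVERHEAD = 20 + 8 + 12      # IPv4 + UDP + RTP headers
MAX_RTP_PAYLOAD = MTU - HEADER_OVERHEAD  # 1460 bytes

BYTES_PER_SAMPLE = {"L16": 2, "L24": 3}


def samples_per_packet(sample_rate_hz: int, packet_time_us: int) -> int:
    """Samples per channel in one packet (floored to a whole sample)."""
    return sample_rate_hz * packet_time_us // 1_000_000


def payload_bytes(sample_rate_hz: int, packet_time_us: int,
                  channels: int, fmt: str = "L24") -> int:
    """RTP payload size for one packet of linear PCM audio."""
    return (samples_per_packet(sample_rate_hz, packet_time_us)
            * channels * BYTES_PER_SAMPLE[fmt])


def max_channels(sample_rate_hz: int, packet_time_us: int,
                 fmt: str = "L24") -> int:
    """Largest channel count whose payload still fits in one packet."""
    per_channel = (samples_per_packet(sample_rate_hz, packet_time_us)
                   * BYTES_PER_SAMPLE[fmt])
    return MAX_RTP_PAYLOAD // per_channel


if __name__ == "__main__":
    # 8 channels of 24-bit audio at 48 kHz with 1 ms packets.
    print(payload_bytes(48_000, 1_000, 8))
    # 24-bit audio at 96 kHz with 4 ms packets: only one channel fits.
    print(max_channels(96_000, 4_000))
```

For 96 kHz and 4 ms, one 24-bit channel needs 384 samples × 3 bytes = 1152 bytes, so a second channel would exceed the 1460-byte limit, which matches the single-channel restriction stated above.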


Latency

Network latency (link offset) is the time difference between the moment an audio stream enters the source (ingress time), marked by the RTP timestamp in the media packet, and the moment it leaves the destination (egress time). Latency depends on packet time, propagation and queuing delays, packet processing overhead, and buffering in the destination device. The minimum latency is therefore the shortest packet time plus the network forwarding time, which can be less than 1 μs on a point-to-point Gigabit Ethernet link with minimum packet size, but in real-world networks latency could be twice the packet time.

Small buffers decrease latency but may result in audio dropouts when media data does not arrive on time. Unexpected changes to network conditions and jitter from packet encoding and processing may require longer buffering and therefore higher latency. Destinations are required to use a buffer of 3 times the packet time, though at least 20 times the packet time (or 20 ms if smaller) is recommended. Sources are required to maintain transmission with jitter of less than 17 packet times (or 17 ms if shorter), though 1 packet time (or 1 ms if shorter) is recommended.
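The buffer and jitter figures above translate directly into durations for a given packet time. The sketch below is only an illustration of that arithmetic, with hypothetical function names; all durations are in microseconds.

```python
from typing import Tuple


def receiver_buffer_us(packet_time_us: int) -> Tuple[int, int]:
    """Return (required, recommended) receive buffer depth in microseconds."""
    required = 3 * packet_time_us
    # Recommended: 20 packet times, or 20 ms if that is smaller.
    recommended = min(20 * packet_time_us, 20_000)
    return required, recommended


def sender_jitter_limit_us(packet_time_us: int) -> Tuple[int, int]:
    """Return (required, recommended) upper bounds on sender jitter."""
    required = min(17 * packet_time_us, 17_000)
    recommended = min(packet_time_us, 1_000)
    return required, recommended


def us_to_samples(duration_us: int, sample_rate_hz: int) -> int:
    """Convert a duration to a whole number of samples per channel."""
    return duration_us * sample_rate_hz // 1_000_000


if __name__ == "__main__":
    required, recommended = receiver_buffer_us(1_000)
    # With 1 ms packets at 48 kHz the required buffer holds 144 samples.
    print(required, recommended, us_to_samples(required, 48_000))
```

With 4 ms packets the recommended buffer is capped at 20 ms rather than growing to 80 ms, which is the effect of the "or 20 ms if smaller" clause.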


Interoperability with AVB

AES67 may transport media streams as IEEE 802.1BA AVB time-sensitive traffic Classes A and B on supported networks, with guaranteed latency of 2 ms and 50 ms respectively. Reservation of bandwidth with the Stream Reservation Protocol (SRP) specifies the amount of traffic generated during a measurement interval of 125 μs and 250 μs respectively. Multicast IP addresses have to be used, though only with a single source, as AVB networks only support Ethernet multicast destination addressing in the range from 01:00:5e:00:00:00 to 01:00:5e:7f:ff:ff. AES67 also defines how the fields of an SRP talker advertise message shall be mapped.

Under both IEEE 1588-2008 and IEEE 802.1AS, a PTP clock can be designated as an ordinary clock (OC), boundary clock (BC) or transparent clock (TC), though 802.1AS transparent clocks also have some boundary clock capabilities. A device may implement one or more of these capabilities. An OC may have as few as one port (network connection), while a TC or BC must have two or more ports. BC and OC ports can work as a master (grandmaster) or a slave. An IEEE 1588 profile is associated with each port. A TC can belong to multiple clock domains and profiles. These provisions make it possible to synchronize IEEE 802.1AS clocks to the IEEE 1588-2008 clocks used by AES67.
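The Ethernet multicast range quoted above is the one produced by the standard IPv4-to-Ethernet multicast mapping, in which the low 23 bits of the group address are copied into the fixed prefix 01:00:5e. The sketch below shows that mapping; the function name `multicast_mac` is hypothetical.

```python
import ipaddress


def multicast_mac(ipv4_group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet multicast MAC."""
    addr = ipaddress.IPv4Address(ipv4_group)
    if not addr.is_multicast:
        raise ValueError(f"{ipv4_group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF          # keep the low 23 bits of the group
    mac = (0x01005E << 24) | low23        # fixed 01:00:5e prefix, top bit of low 24 is 0
    # Format the 48-bit value as six colon-separated hex octets.
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


if __name__ == "__main__":
    print(multicast_mac("239.0.0.0"))        # lowest address of the range
    print(multicast_mac("239.255.255.255"))  # highest address of the range
```

Because only 23 bits survive, several IP groups share one MAC address (239.0.0.1 and 239.128.0.1, for example), which is worth keeping in mind when assigning stream addresses.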


Development history

The standard was developed by the Audio Engineering Society beginning at the end of 2010. It was initially published in September 2013. A second printing, which added a patent statement from Audinate, was published in March 2014. The Media Networking Alliance was formed in October 2014 to promote adoption of AES67. In October 2014 a plugfest was held to test interoperability achieved with AES67. A second plugfest was conducted in November 2015 and a third in February 2017. An update to the standard including clarifications and error corrections was issued in September 2015. In May 2016, the AES published a report describing synchronization interoperability between AES67 and SMPTE 2059-2. In June 2016, AES67 audio transport enhanced by AVB/TSN clock synchronisation and bandwidth reservation was demonstrated at InfoComm 2016. In September 2017, SMPTE published ST 2110, a standard for professional video over IP, which uses AES67 as the transport for audio accompanying the video. In December 2017 the Media Networking Alliance merged with the Alliance for IP Media Solutions (AIMS), combining efforts to promote standards-based network transport for audio and video. In April 2018 AES67-2018 was published. The principal change in this revision is the addition of a protocol implementation conformance statement (PICS). The AES Standards Committee and the AES67 editor, Kevin Gross, were recipients of a Technology & Engineering Emmy Award in 2019 for the "development of synchronized multi-channel uncompressed audio transport over IP networks."


Adoption

The standard has been implemented by Lawo, Axia, AMX (in SVSI devices), Wheatstone, Extron Electronics, Riedel, Ross Video, ALC NetworX, Audinate, Archwave, Digigram, Sonifex, Yamaha, QSC, Neutrik, Attero Tech, Merging Technologies, Gallery SIENNA, Behringer and Tieline, and is supported by RAVENNA-enabled devices under its AES67 Operational Profile.


Shipping products

Over time this table will grow to become a resource for integration and compatibility between devices. The discovery methods supported by each device are critical for integration, since the AES67 specification does not stipulate how discovery should be done but instead provides a variety of options or suggestions. Also, AES67 specifies both multicast and unicast operation, but many AES67 devices only support multicast.


External links


* Media Networking Alliance
* AIMS Alliance
* Open-source AES67 implementation (proposed)
* Open-source AES67 implementation for Linux
* Open-source AES67 Monitoring Software