In
video coding
A video coding format (or sometimes video compression format) is a content representation format for storage or transmission of digital video content (such as in a data file or bitstream). It typically uses a standardized video compression algo ...
, a group of pictures, or GOP structure, specifies the order in which
intra- and
inter-frame
An inter frame is a frame in a video compression stream which is expressed in terms of one or more neighboring frames. The "inter" part of the term refers to the use of ''Inter frame prediction''. This kind of prediction tries to take advantage fr ...
s are arranged. The GOP is a collection of successive pictures within a coded video stream. Each coded video stream consists of successive GOPs, from which the visible frames are generated. Encountering a new GOP in a compressed video stream means that the decoder doesn't need any previous frames in order to decode the next ones, and allows fast seeking through the video.
Description
A GOP can contain the following picture types:
* I picture or
I frame (intra coded picture, also called keyframe or i-frame) – a picture that is coded independently of all other pictures. Each GOP begins (in decoding order) with this type of picture.
* P picture or
P frame (predictive coded picture) – contains
motion-compensated difference information relative to previously decoded pictures. In older designs such as
MPEG-1
MPEG-1 is a standard for lossy compression of video and audio. It is designed to compress VHS-quality raw digital video and CD audio down to about 1.5 Mbit/s (26:1 and 6:1 compression ratios respectively) without excessive quality loss, making ...
,
H.262
H.262 or MPEG-2 Part 2 (formally known as ITU-T Recommendation H.262 and ISO/IEC 13818-2, also known as MPEG-2 Video) is a video coding format standardised and jointly maintained by ITU-T Study Group 16 Video Coding Experts Group (VCEG) and Inte ...
/
MPEG-2
MPEG-2 (a.k.a. H.222/H.262 as was defined by the ITU) is a standard for "the generic video coding format, coding of moving pictures and associated audio information". It describes a combination of Lossy compression, lossy video compression and ...
and
H.263, each P picture can only reference one picture, and that picture must precede the P picture in display order as well as in decoding order and must be an I or P picture. These constraints do not apply in the newer standards
H.264/MPEG-4 AVC and
HEVC
High Efficiency Video Coding (HEVC), also known as H.265 and MPEG-H Part 2, is a video compression standard designed as part of the MPEG-H project as a successor to the widely used Advanced Video Coding (AVC, H.264, or MPEG-4 Part 10). In compari ...
.
* B picture or
B frame
In the field of video compression a video frame is compressed using different algorithms with different advantages and disadvantages, centered mainly around amount of data compression. These different algorithms for video frames are called pict ...
(bipredictive coded picture) – contains motion-compensated difference information relative to previously decoded pictures. In older designs such as MPEG-1 and H.262/MPEG-2, each B picture can only reference two pictures, the one which precedes the B picture in display order and the one which follows, and all referenced pictures must be I or P pictures. These constraints do not apply in newer standards
H.264/MPEG-4 AVC and
HEVC
High Efficiency Video Coding (HEVC), also known as H.265 and MPEG-H Part 2, is a video compression standard designed as part of the MPEG-H project as a successor to the widely used Advanced Video Coding (AVC, H.264, or MPEG-4 Part 10). In compari ...
.
* D picture or
D frame
MPEG-1 is a standard for lossy compression of video and audio. It is designed to compress VHS-quality raw digital video and CD audio down to about 1.5 Mbit/s (26:1 and 6:1 compression ratios respectively) without excessive quality loss, making ...
(DC direct coded picture) – serves as a fast-access representation of a picture for loss robustness or fast-forward. D pictures are only used in
MPEG-1
MPEG-1 is a standard for lossy compression of video and audio. It is designed to compress VHS-quality raw digital video and CD audio down to about 1.5 Mbit/s (26:1 and 6:1 compression ratios respectively) without excessive quality loss, making ...
video.
An I frame indicates the beginning of a GOP. Afterwards several P and B frames follow. In older designs, the allowed ordering and referencing structure is relatively constrained.
The I frames contain the full image and do not require any additional information to reconstruct them. Typically, encoders use GOP structures that cause each I frame to be a "clean random access point," such that decoding can start cleanly on an I frame and any errors within the GOP structure are corrected after processing a correct I frame.
In the newer designs found in
H.264/MPEG-4 AVC and
HEVC
High Efficiency Video Coding (HEVC), also known as H.265 and MPEG-H Part 2, is a video compression standard designed as part of the MPEG-H project as a successor to the widely used Advanced Video Coding (AVC, H.264, or MPEG-4 Part 10). In compari ...
, encoders have much more flexibility about referencing structures. They can use the same referencing structures as were previously used in older designs, or they can use more pictures as references and they can use more flexible ordering of the coding order relative to the display order. They are also allowed to use B pictures as references when coding other (B or P) pictures. This extra flexibility can improve compression efficiency, but it can cause propagation of errors if some data becomes lost or corrupted. One popular structure for use with the newer designs is the use of a hierarchy of B pictures. Hierarchical B pictures can provide very good compression efficiency and can also limit the propagation of errors, since the hierarchy can ensure that the number of pictures affected by any data corruption problem is strictly limited.
Generally, the more I frames the video stream has, the more editable it is. However, having more I frames substantially increases bit rate needed to code the video.
GOP Structure
The GOP structure is often referred by two numbers, for example, M=3, N=12. The first number tells the distance between two anchor frames (I or P). The second one tells the distance between two full images (I-frames): it is the GOP size.
[{{Cite web, url=https://help.apple.com/compressor/mac/4.0/en/compressor/usermanual/#chapter=18%26section=5, title=Compressor 4 User Manual] For the example M=3, N=12, the GOP structure is IBBPBBPBBPBBI. Instead of the M parameter the maximal count of B-frames between two consecutive anchor frames can be used.
For example, in a sequence with pattern IBBBBPBBBBPBBBBI, the GOP size (N value) is equal to 15 (length between two I frames) and distance between two anchor frames (M value) is 5 (length between I and P frames or length between two consecutive P Frames).
References
MPEG
Video compression