In the field of video compression, a video frame is compressed using different algorithms with different advantages and disadvantages, centered mainly around the amount of data compression. These different algorithms for video frames are called picture types or frame types. The three major picture types used in the different video algorithms are I, P and B. They differ in the following characteristics:

* I‑frames are the least compressible but don't require other video frames to decode.
* P‑frames can use data from previous frames to decompress and are more compressible than I‑frames.
* B‑frames can use both previous and forward frames for data reference to get the highest amount of data compression.

Summary

Three types of ''pictures'' (or frames) are used in video compression: I, P, and B frames.

An I‑frame (Intra-coded picture) is a complete image, like a JPG or BMP image file.

A P‑frame (Predicted picture) holds only the changes in the image from the previous frame. For example, in a scene where a car moves across a stationary background, only the car's movements need to be encoded. The encoder does not need to store the unchanging background pixels in the P‑frame, thus saving space. P‑frames are also known as ''delta‑frames''.

A B‑frame (Bidirectional predicted picture) saves even more space by using differences between the current frame and both the preceding and following frames to specify its content.

P and B frames are also called inter frames. The order in which the I, P and B frames are arranged is called the group of pictures (GOP).
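The delta-frame idea can be sketched in a few lines of Python. This is a toy illustration only: frames are flat lists of pixel values, and the helper names `encode_delta` and `apply_delta` are hypothetical. Real codecs do not store changed pixels by index; they use block-based motion compensation and transform coding.

```python
def encode_delta(previous, current):
    """Return {pixel_index: new_value} for every pixel that changed."""
    return {
        i: cur
        for i, (prev, cur) in enumerate(zip(previous, current))
        if prev != cur
    }


def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous)
    for i, value in delta.items():
        frame[i] = value
    return frame


# Stationary background (0) with a "car" (9) moving one pixel to the right.
i_frame = [0, 9, 9, 0, 0, 0, 0, 0]      # complete image
next_frame = [0, 0, 9, 9, 0, 0, 0, 0]   # what the next picture looks like

p_frame = encode_delta(i_frame, next_frame)   # only the changed pixels
decoded = apply_delta(i_frame, p_frame)       # decoder side
```

Here `p_frame` holds two entries instead of eight pixels, and the decoder cannot rebuild `next_frame` without first having `i_frame`, which is the dependency that distinguishes inter frames from I‑frames.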

Pictures/frames

While the terms "frame" and "picture" are often used interchangeably, the term ''picture'' is a more general notion, as a picture can be either a frame or a field. A frame is a complete image, and a field is the set of odd-numbered or even-numbered scan lines composing a partial image. For example, an HD 1080 picture has 1080 lines (rows) of pixels. An odd field consists of pixel information for lines 1, 3, 5, ..., 1079. An even field has pixel information for lines 2, 4, 6, ..., 1080. When video is sent in interlaced-scan format, each frame is sent in two fields, the field of odd-numbered lines followed by the field of even-numbered lines.

A frame used as a reference for predicting other frames is called a reference frame.

Frames encoded without information from other frames are called I-frames. Frames that use prediction from a single preceding reference frame (or a single frame for prediction of each region) are called P-frames. B-frames use prediction from a (possibly weighted) average of two reference frames, one preceding and one succeeding.
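The frame/field relationship for the 1080-line example can be sketched as follows. This is a minimal illustration, assuming a frame is simply a Python list of rows; `split_fields` and `weave_fields` are hypothetical helper names, not part of any codec API.

```python
def split_fields(frame):
    """Split a frame (list of rows) into (odd_field, even_field).

    Scan lines are numbered from 1, so the odd field holds lines 1, 3, 5, ...
    which are list indices 0, 2, 4, ...
    """
    return frame[0::2], frame[1::2]


def weave_fields(odd_field, even_field):
    """Interleave two fields back into a full frame.

    Assumes both fields have the same number of lines (even line count).
    """
    frame = []
    for odd_row, even_row in zip(odd_field, even_field):
        frame.append(odd_row)
        frame.append(even_row)
    return frame


# Stand-in for an HD 1080 picture: one label per scan line.
frame = [f"line {n}" for n in range(1, 1081)]
odd_field, even_field = split_fields(frame)
```

Each field holds 540 lines, and weaving the odd field with the even field restores the original frame.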

Slices

In the H.264/MPEG-4 AVC standard, the granularity of prediction types is brought down to the "slice level." A slice is a spatially distinct region of a frame that is encoded separately from any other region in the same frame. I-slices, P-slices, and B-slices take the place of I, P, and B frames.

Macroblocks

Typically, pictures (frames) are segmented into ''macroblocks'', and individual prediction types can be selected on a macroblock basis rather than being the same for the entire picture, as follows:

* I-frames can contain only intra macroblocks
* P-frames can contain both intra macroblocks and predicted macroblocks
* B-frames can contain intra, predicted, and bi-predicted macroblocks

Furthermore, in the H.264 video coding standard, the frame can be segmented into sequences of macroblocks called ''slices'', and instead of using I, B and P-frame type selections, the encoder can choose the prediction style distinctly on each individual slice. H.264 also defines several additional frame/slice types and features:

* SI‑frames/slices (Switching I): Facilitate switching between coded streams; contain SI-macroblocks (a special type of intra coded macroblock).
* SP‑frames/slices (Switching P): Facilitate switching between coded streams; contain P and/or I-macroblocks.
* Multi‑frame motion estimation (up to 16 reference frames or 32 reference fields).

Multi‑frame motion estimation increases the quality of the video, while allowing the same compression ratio. SI and SP frames (defined for the Extended Profile) improve error correction. When such frames are used along with a smart decoder, it is possible to recover the broadcast streams of damaged DVDs.
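The rule for which macroblock types each I, P or B picture may contain can be written as a small lookup table. The sketch below is illustrative only: the string labels and the `is_valid_picture` helper are assumptions made for this example and do not correspond to syntax elements of any standard. SI and SP types are left out.

```python
# Macroblock prediction types permitted in each picture type.
ALLOWED_MACROBLOCKS = {
    "I": {"intra"},
    "P": {"intra", "predicted"},
    "B": {"intra", "predicted", "bi-predicted"},
}


def is_valid_picture(picture_type, macroblock_types):
    """Check that every macroblock type is permitted for this picture type."""
    allowed = ALLOWED_MACROBLOCKS[picture_type]
    return all(mb in allowed for mb in macroblock_types)


i_ok = is_valid_picture("I", ["intra", "intra"])
i_bad = is_valid_picture("I", ["intra", "predicted"])
b_ok = is_valid_picture("B", ["bi-predicted", "intra", "predicted"])
p_bad = is_valid_picture("P", ["bi-predicted"])
```

The same table could be keyed by slice type rather than picture type to reflect the per-slice selection that H.264 allows.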

Intra-coded (I) frames/slices (key frames)

* I-frames contain an entire image. They are coded without reference to any other frame except (parts of) themselves.
* May be generated by an encoder to create a random access point (to allow a decoder to start decoding properly from scratch at that picture location).
* May also be generated when differentiating image details prohibit generation of effective P or B-frames.
* Typically require more bits to encode than other frame types.

Often, I‑frames are used for random access and are used as references for the decoding of other pictures. Intra refresh periods of a half-second are common in such applications as digital television broadcast and DVD storage. Longer refresh periods may be used in some environments. For example, in videoconferencing systems it is common to send I-frames very infrequently.
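To make the refresh period concrete, the sketch below converts a period in seconds into a spacing in frames. The 30 frames-per-second rate is an assumption chosen for the example, and both function names are hypothetical.

```python
def intra_period_frames(refresh_seconds, frames_per_second):
    """Number of frames between consecutive I-frames (at least 1)."""
    return max(1, round(refresh_seconds * frames_per_second))


def is_intra_frame(frame_index, period):
    """True if the frame at this index starts a new refresh period."""
    return frame_index % period == 0


# Half-second refresh at an assumed 30 frames per second.
period = intra_period_frames(0.5, 30)

# Positions of the I-frames within the first two seconds (60 frames).
intra_positions = [n for n in range(60) if is_intra_frame(n, period)]
```

With these numbers an I-frame occurs every 15 frames, so a decoder joining the stream waits at most half a second for a random access point. A fixed spacing is only one possible policy; an encoder may also insert I-frames elsewhere, for example when P or B-frames would be ineffective.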

Predicted (P) frames/slices

* Require the prior decoding of some other picture(s) in order to be decoded.
* May contain both image data and motion vector displacements and combinations of the two.
* Can reference previous pictures in decoding order.
* Older standard designs (such as MPEG-2) use only one previously decoded picture as a reference during decoding, and require that picture to also precede the P picture in display order.
* In H.264, can use multiple previously decoded pictures as references during decoding, and can have any arbitrary display-order relationship relative to the picture(s) used for their prediction.
* Typically require fewer bits for encoding than I-frames.
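How a motion vector displacement and image data combine can be sketched as motion-compensated prediction plus a residual. This is a simplified illustration with whole-pixel vectors and no bounds checking; `predict_block` and `reconstruct_block` are hypothetical names, and real codecs add sub-pixel interpolation, transforms and quantization.

```python
def predict_block(reference, top, left, motion_vector, size):
    """Copy a size x size block from the reference frame.

    The block is taken at (top, left) displaced by motion_vector = (dy, dx).
    """
    dy, dx = motion_vector
    return [
        [reference[top + dy + r][left + dx + c] for c in range(size)]
        for r in range(size)
    ]


def reconstruct_block(prediction, residual):
    """Add the coded residual (image data) to the motion-compensated prediction."""
    return [
        [p + e for p, e in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]


# Previously decoded reference picture with a bright 2x2 object in the middle.
reference = [
    [10, 10, 10, 10],
    [10, 50, 60, 10],
    [10, 70, 80, 10],
    [10, 10, 10, 10],
]

# The block at (0, 0) in the current picture is predicted from (1, 1).
prediction = predict_block(reference, top=0, left=0, motion_vector=(1, 1), size=2)
residual = [[1, 0], [0, -2]]          # small correction sent alongside the vector
block = reconstruct_block(prediction, residual)
```

Only the vector `(1, 1)` and the small residual need to be transmitted for this block, which is why such pictures usually cost fewer bits than coding the same content as intra data.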

Bi-directional predicted (B) frames/slices (macroblocks)

* Require the prior decoding of subsequent frame(s) in order to be displayed.
* May contain image data and/or motion vector displacements. Older standards allow only a single global motion compensation vector for the entire frame or a single motion compensation vector per macroblock.
* Include some prediction modes that form a prediction of a motion region (e.g., a macroblock or a smaller area) by averaging the predictions obtained using two different previously decoded reference regions. Some standards allow two motion compensation vectors per macroblock (biprediction).
* In older standards (such as MPEG-2), B-frames are never used as references for the prediction of other pictures. As a result, a lower quality encoding (requiring less space) can be used for such B-frames because the loss of detail will not harm the prediction quality for subsequent pictures.
* H.264 relaxes this restriction, and allows B-frames to be used as references for the decoding of other frames at the encoder's discretion.
* Older standards (such as MPEG-2) use exactly two previously decoded pictures as references during decoding, and require one of those pictures to precede the B-frame in display order and the other one to follow it.
* H.264 allows for one, two, or more than two previously decoded pictures as references during decoding, and a B-frame can have any arbitrary display-order relationship relative to the picture(s) used for its prediction.
* The heightened flexibility of information retrieval means that B-frames typically require fewer bits for encoding than either I or P-frames.
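Two of the points above can be sketched in code: the averaging of two reference regions, and the reordering that follows from B-frames needing a later picture decoded first. The example assumes the older MPEG-2 style behaviour, where B-frames are not used as references. The function names, the flat-list block format and the `past_weight` parameter are all hypothetical.

```python
def bi_predict(past_block, future_block, past_weight=0.5):
    """Form a prediction as a (possibly weighted) average of two reference blocks."""
    future_weight = 1.0 - past_weight
    return [
        round(past_weight * a + future_weight * b)
        for a, b in zip(past_block, future_block)
    ]


def decode_order(display_order):
    """Reorder picture types from display order into decoding order.

    Each B-frame is held back until the next I or P picture (its
    following reference) has been emitted.
    """
    coded, pending_b = [], []
    for index, frame_type in enumerate(display_order):
        if frame_type == "B":
            pending_b.append((index, frame_type))
        else:
            coded.append((index, frame_type))
            coded.extend(pending_b)
            pending_b = []
    # Simplification: trailing B-frames have no later reference in this
    # sequence, so they are appended as-is.
    coded.extend(pending_b)
    return coded


prediction = bi_predict([100, 120], [110, 140])
order = decode_order("IBBPBBP")
```

In the reordered sequence each P picture is decoded before the two B pictures that precede it in display order, which is why a decoder must buffer pictures and why B-frames add latency.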

External links

Video streaming with SP and SI frames