CIF (''Common Intermediate Format'' or ''Common Interchange Format''), also known as FCIF (''Full Common Intermediate Format''), is a standardized format for the picture resolution, frame rate, color space, and color subsampling of digital video sequences used in video teleconferencing systems. It was first defined in the H.261 standard in 1988.

Figure: CIF and D1 definitions comparison

As the word "common" in its name implies, CIF was designed as a common compromise format to be relatively easy to convert for use with either PAL or NTSC standard displays and cameras. CIF defines a video sequence with a resolution of 352 × 288, which has a simple relationship to the PAL picture size, but with a frame rate of 30000/1001 (roughly 29.97) frames per second like NTSC, with color encoded using a YCbCr representation with 4:2:0 color sampling. It was designed as a compromise between PAL and NTSC schemes, since it uses a picture size that corresponds most easily to PAL, but uses the frame rate of NTSC. The compromise was established as a way to reach international agreement so that video conferencing systems in different countries could communicate with each other without needing two separate modes for displaying the received video.
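The parameters above fix the amount of raw (uncompressed) data in a CIF sequence. The following is a minimal Python sketch of that arithmetic, not part of any standard text. The picture size, frame rate, and 4:2:0 sampling come from the description above; the 8 bits per sample is an assumption made for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

# Values taken from the description above.
WIDTH, HEIGHT = 352, 288
FRAME_RATE = Fraction(30000, 1001)  # roughly 29.97 frames per second

# Assumption for illustration only: 8 bits per luma or chroma sample.
BITS_PER_SAMPLE = 8


def samples_per_frame_420(width: int, height: int) -> int:
    """Count samples in one 4:2:0 frame.

    One full-resolution luma plane plus two chroma planes (Cb and Cr),
    each subsampled by 2 horizontally and by 2 vertically.
    """
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return luma + chroma


def raw_bit_rate(width: int, height: int,
                 frame_rate: Fraction, bits_per_sample: int) -> Fraction:
    """Exact uncompressed bit rate in bits per second."""
    return samples_per_frame_420(width, height) * bits_per_sample * frame_rate


if __name__ == "__main__":
    samples = samples_per_frame_420(WIDTH, HEIGHT)
    rate = raw_bit_rate(WIDTH, HEIGHT, FRAME_RATE, BITS_PER_SAMPLE)
    print(f"samples per frame: {samples}")
    print(f"raw bit rate: {float(rate) / 1e6:.2f} Mbit/s")
```

Under these assumptions the uncompressed rate comes to roughly 36 Mbit/s, which gives a sense of how much compression a low-bit-rate teleconferencing link requires.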

Technical details

The simple way to convert NTSC video to CIF is to capture every other field (e.g., the top fields) of interlaced video, downsample it by 2:1 horizontally to convert 704 samples per line to 352 samples per line, and upsample it vertically by a ratio of 6:5 to convert 240 lines to 288 lines. The simple way to convert PAL video to CIF is to similarly capture every other field, downsample it horizontally by 2:1, and introduce some jitter in the frame rate by skipping or repeating frames as necessary. Since H.261 systems typically operated at low bit rates, they also typically operated at low frame rates by skipping many of the camera source frames, so introducing some jitter in the frame rate tended not to be noticeable. More sophisticated conversion schemes (e.g., using deinterlacing to improve the vertical resolution from an NTSC camera) could also be used in higher quality systems.

In contrast to the CIF compromise that originated with the H.261 standard, there are two variants of SIF (''Source Input Format''), which was first defined in the MPEG-1 standard. SIF is otherwise very similar to CIF. SIF on 525-line ("NTSC") based systems is 352 × 240 with a frame rate of 30000/1001 frames per second, and on 625-line ("PAL") based systems, it has the same picture size as CIF (352 × 288) but with a frame rate of 25 frames per second.

Some references to CIF are intended to refer only to its ''resolution'' (352 × 288), without intending to refer to its frame rate.

The YCbCr color representation had been previously defined in the first standard digital video source format, CCIR 601, in 1982. However, CCIR 601 uses 4:2:2 color sampling, which subsamples the Cb and Cr components only horizontally. H.261 additionally used vertical color subsampling, resulting in what is known as 4:2:0.

QCIF means "Quarter CIF". To have one quarter of the area, as "quarter" implies, the height and width of the frame are halved. Terms also used are SQCIF (Sub Quarter CIF, sometimes Sub-QCIF), SCIF (sometimes Sub-CIF), 4CIF (4 × CIF), 9CIF (9 × CIF) and 16CIF (16 × CIF). The resolutions for all of these formats are summarized in the table below.

xCIF pixels are not square, instead having a "native" pixel aspect ratio (PAR) of 12:11 (PAR = DAR : SAR = 4/3 : 11/9 = 12/11), as with the standard for 625-line systems. On square-pixel displays (e.g., computer screens and many modern televisions) xCIF rasters should be rescaled so that the picture covers a 4:3 area, in order to avoid a "stretched" look: CIF content expanded horizontally by 12:11 results in a 4:3 raster of 384 × 288 square pixels (384 = 352 × 12/11). (This can happen on larger graphics displays of any aspect ratio in a window of square pixels or enlarged to full screen on any larger 4:3 graphic display.) [citation needed]

The CIF and QCIF picture dimensions were specifically chosen to be multiples of 16 because of the way that discrete cosine transform based video compression/decompression was handled in H.261, using 16 × 16 macroblocks and 8 × 8 transform blocks. So a CIF-size image (352 × 288) contains 22 × 18 macroblocks and a QCIF image (176 × 144) contains 11 × 9 macroblocks. The 16 × 16 macroblock concept was later also used in other compression standards such as MPEG-2, MPEG-4 Part 2, H.263, and H.264/MPEG-4 AVC.
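The frame geometry described in this section can be checked with a few lines of arithmetic. The following Python sketch is illustrative only. It assumes that the numeric prefix in names such as 4CIF, 9CIF, and 16CIF denotes an area multiple of CIF (by the same convention as "quarter" in QCIF), so each dimension scales by the square root of that multiple. SQCIF and SCIF are left out because they are not simple multiples of CIF. All function and variable names are hypothetical.

```python
from fractions import Fraction

CIF = (352, 288)                 # width, height, from the text
MACROBLOCK = 16                  # 16 x 16 macroblocks, as used in H.261
PIXEL_ASPECT = Fraction(12, 11)  # xCIF pixel aspect ratio, from the text

# Linear (per-dimension) scale relative to CIF. The area multiple is the
# square of this value: 1/4 for QCIF, 4 for 4CIF, 9 for 9CIF, 16 for 16CIF.
LINEAR_SCALE = {
    "QCIF": Fraction(1, 2),
    "CIF": Fraction(1),
    "4CIF": Fraction(2),
    "9CIF": Fraction(3),
    "16CIF": Fraction(4),
}


def frame_size(name: str) -> tuple[int, int]:
    """Width and height in samples for a format listed in LINEAR_SCALE."""
    scale = LINEAR_SCALE[name]
    width, height = CIF[0] * scale, CIF[1] * scale
    if width.denominator != 1 or height.denominator != 1:
        raise ValueError(f"{name} does not scale to whole samples")
    return int(width), int(height)


def macroblock_grid(width: int, height: int) -> tuple[int, int]:
    """Number of 16 x 16 macroblocks across and down the picture."""
    if width % MACROBLOCK or height % MACROBLOCK:
        raise ValueError("dimensions must be multiples of 16")
    return width // MACROBLOCK, height // MACROBLOCK


def square_pixel_width(width: int) -> Fraction:
    """Width after stretching 12:11 pixels horizontally onto square pixels."""
    return width * PIXEL_ASPECT


if __name__ == "__main__":
    for name in LINEAR_SCALE:
        w, h = frame_size(name)
        cols, rows = macroblock_grid(w, h)
        print(f"{name}: {w} x {h}, {cols} x {rows} macroblocks, "
              f"{square_pixel_width(w)} x {h} on square pixels")
```

For CIF this reproduces the figures given above: 22 × 18 macroblocks and a 384 × 288 square-pixel raster.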

References

ITU-T H.261 standard