Pyramid vector quantization (PVQ) is a method used in audio and video

codec A codec is a device or computer program that encodes or decodes a data stream or signal. ''Codec'' is a portmanteau of coder/decoder. In electronic communications, an endec is a device that acts as both an encoder and a decoder on a signal or ...

s to

quantize Quantization is the process of constraining an input from a continuous or otherwise large set of values (such as the real numbers) to a discrete set (such as the integers). The term ''quantization'' may refer to: Signal processing * Quantizatio ...

and transmit

unit vector In mathematics, a unit vector in a normed vector space is a vector (often a spatial vector) of length 1. A unit vector is often denoted by a lowercase letter with a circumflex, or "hat", as in \hat (pronounced "v-hat"). The term ''direction ve ...

s, i.e. vectors whose magnitudes are known to the decoder but whose directions are unknown. PVQ may also be used as part of a gain/shape quantization scheme, whereby the magnitude and direction of a vector are quantized separately from each other. PVQ was initially described in 1986 in the paper "A Pyramid Vector Quantizer" by Thomas R. Fischer. PVQ power projection

One caveat of PVQ is that it operates under the taxicab distance (L1-norm). Conversion to/from the more familiar

Euclidean distance In mathematics, the Euclidean distance between two points in Euclidean space is the length of a line segment between the two points. It can be calculated from the Cartesian coordinates of the points using the Pythagorean theorem, therefore o ...

(L2-norm) is possible via

vector projection The vector projection of a vector on (or onto) a nonzero vector , sometimes denoted \operatorname_\mathbf \mathbf (also known as the vector component or vector resolution of in the direction of ), is the orthogonal projection of onto a straig ...

, though results in a less uniform distribution of quantization points (the poles of the Euclidean ''n''-sphere become denser than non-poles). No efficient algorithm for the ideal (i.e., uniform) vector quantization of the Euclidean ''n''-sphere is known as of 2010. This non-uniformity can be reduced by applying deformation like coordinate-wise power before projection, reducing mean-squared quantization error by ~10%. PVQ is used in the

CELT The Celts (, see pronunciation for different usages) or Celtic peoples () are. "CELTS location: Greater Europe time period: Second millennium B.C.E. to present ancestry: Celtic a collection of Indo-European peoples. "The Celts, an ancien ...

audio codec (now Opus) and the

Daala Daala is a video coding format under development by the Xiph.Org Foundation under the lead of Timothy B. Terriberry mainly sponsored by the Mozilla Corporation. Like Theora and Opus, Daala is available free of any royalties and its reference i ...

video codec.

Overview

As a form of

vector quantization Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. It was originally used for data compression. It works by di ...

, PVQ defines a codebook of quantization points, each of which is assigned an

integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...

codeword from 0 to −1. The goal of the encoder is to find the codeword of the closest vector, which the decoder must decode back into a vector. The PVQ codebook consists of all -dimensional points

\vec

with integer-only coordinates whose absolute values sum to a constant (i.e. whose

L1-norm In mathematics, the spaces are function spaces defined using a natural generalization of the -norm for finite-dimensional vector spaces. They are sometimes called Lebesgue spaces, named after Henri Lebesgue , although according to the Bourbaki ...

equals ). In

set-builder notation In set theory and its applications to logic, mathematics, and computer science, set-builder notation is a mathematical notation for describing a set by enumerating its elements, or stating the properties that its members must satisfy. Definin ...

: :

S(N,K)=\left\

where

\left\,  \vec \right\, _1

denotes the L1-norm of

\vec

. As it stands, the set tesselates the surface of an -dimensional pyramid. If desired, we may reshape it into a sphere by "projecting" the points onto the sphere, i.e. by normalizing them: :

S_\text(N,K)=\left\

where

\left\,  \vec \right\, _2

denotes the

L2-norm In mathematics, a norm is a function from a real or complex vector space to the non-negative real numbers that behaves in certain ways like the distance from the origin: it commutes with scaling, obeys a form of the triangle inequality, and is z ...

\vec

. Increasing the parameter results in more quantization points, and hence typically yields a more "accurate" approximation of the original

\vec

at the cost of larger integer codewords that require more bits to transmit.

Example

Suppose we wish to quantize three-dimensional unit vectors using the parameter =2. Our codebook becomes: (0.707 =

\sqrt/2

rounded to 3 decimal places.) Now, suppose we wish to transmit the unit vector <0.592, −0.720, 0.362> (rounded here to 3 decimal places, for clarity). According to our codebook, the closest point we can pick is codeword 13 (<0.707, −0.707, 0.000>), located approximately 0.381 units away from our original point. Increasing the parameter results in a larger codebook, which typically increases the reconstruction accuracy. For example, based on the Python code below, =5 (codebook size: 102) yields an error of only 0.097 units, and =20 (codebook size: 1602) yields an error of only 0.042 units.

Python code

import itertools import math from typing import List, NamedTuple, Tuple class PVQEntry(NamedTuple): codeword: int point: Tuple nt, ... normalizedPoint: Tuple loat, ... def create_pvq_codebook(n: int, k: int) -> List VQEntry """ Naive algorithm to generate an n-dimensional PVQ codebook with k pulses. Runtime complexity: O(k**n) """ ret = [] for p in itertools.product(range(-k, k + 1), repeat=n): if sum(abs(x) for x in p)

k: norm = math.sqrt(sum(abs(x) ** 2 for x in p)) q = tuple(x / norm for x in p) ret.append(PVQEntry(len(ret), p, q)) return ret def search_pvq_codebook( codebook: List VQEntry p: Tuple loat, ...) -> Tuple VQEntry, float """ Naive algorithm to search the PVQ codebook. Returns the point in the codebook that's "closest" to p, according to the Euclidean distance.) """ ret = None min_dist = None for i in range(len(codebook)): q = codebook normalizedPoint dist = math.sqrt(sum(abs(q - p ** 2 for j in range(len(p)))) if min_dist is None or dist < min_dist: ret = codebook min_dist = dist return ret, min_dist def example(p: Tuple loat, ... k: int) -> None: n = len(p) codebook = create_pvq_codebook(n, k) print("Number of codebook entries: " + str(len(codebook))) entry, dist = search_pvq_codebook(codebook, p) print("Best entry: " + str(entry)) print("Distance: " + str(dist)) phi = 1.2 theta = 5.4 x = math.sin(phi) * math.cos(theta) y = math.sin(phi) * math.sin(theta) z = math.cos(phi) p = (x, y, z) example(p, 2) example(p, 5) example(p, 20)

Complexity

The PVQ codebook can be searched in

O(KN)

. Encoding and decoding can likewise be performed in

O(KN)

using

O(K+N)

memory. The codebook size obeys the recurrence :

V(N,K) = V(N-1,K) + V(N,K-1) + V(N-1,K-1)

with

V(N,0) = 1

for all

N \ge 0

and

V(0,K) = 0

for all

K \ne 0

. A closed-form solution is given by :

V(N,K) = 2N \cdot _2F_1 (1-K,1-N;2;2).

where

_2F_1

is the hypergeometric function.

References

{{Reflist Lossy compression algorithms

Overview

Example

Python code

Complexity

See also

References