The Extensible Metadata Platform (XMP) is an ISO standard, originally created by Adobe Systems Inc., for the creation, processing and interchange of standardized and custom metadata for digital documents and data sets. XMP standardizes a data model, a serialization format and core properties for the definition and processing of extensible metadata. It also provides guidelines for embedding XMP information into popular image, video and document file formats, such as JPEG and PDF, without breaking their readability by applications that do not support XMP. Because these formats may also carry their own non-XMP metadata, that metadata has to be reconciled with the XMP properties. Although metadata can alternatively be stored in a sidecar file, embedding metadata avoids problems that occur when metadata is stored separately. The XMP data model, serialization format and core properties are published by the International Organization for Standardization as the standard ISO 16684-1:2012.


Data model

The defined XMP data model can be used to store any set of metadata properties. These can be simple name/value pairs, structured values or lists of values. The data can be nested as well. The XMP standard also defines particular namespaces for defined sets of core properties (e.g. a namespace for the Dublin Core Metadata Element Set). Custom namespaces can be used to extend the data model. An instance of the XMP data model is called an XMP packet. Adding properties to a packet does not affect existing properties. Software that adds or modifies properties in an XMP packet should leave properties that are unknown to it untouched. This makes XMP useful, for example, for recording the history of a resource as it passes through multiple processing steps, from being photographed, scanned, or authored as text, through photo editing steps (such as cropping or color adjustment), to assembly into a final document. XMP allows each software program or device along the workflow to add its own information to a digital resource, which carries its metadata along. The prerequisite is that all involved editors either actively support XMP, or at least do not delete it from the resource.
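The three value shapes and the "leave unknown properties untouched" rule can be illustrated with a small sketch. It models a packet as a plain Python dictionary keyed by (namespace URI, property name); this representation, the helper `set_property`, the custom namespace URI and all sample values are assumptions made for illustration, not part of the XMP standard or of any XMP library.

```python
# Hypothetical in-memory stand-in for an XMP packet: a mapping from
# (namespace URI, property name) to a value. Not a real XMP API.
DC = "http://purl.org/dc/elements/1.1/"      # Dublin Core namespace
XMP = "http://ns.adobe.com/xap/1.0/"         # XMP basic namespace
CUSTOM = "http://example.com/ns/workflow/"   # made-up custom namespace


def set_property(packet, namespace, name, value):
    """Return a copy of the packet with one property added or replaced.

    Every other property, including ones this code knows nothing about,
    is carried over unchanged.
    """
    updated = dict(packet)
    updated[(namespace, name)] = value
    return updated


packet = {
    (XMP, "CreatorTool"): "Picasa",                  # simple name/value pair
    (DC, "creator"): ["Alice", "Bob"],               # list of values
    (CUSTOM, "scan"): {"device": "X1", "dpi": 600},  # structured value
}

# A later workflow step records its own information in the custom namespace.
edited = set_property(packet, CUSTOM, "history", [{"action": "cropped"}])
```

After the call, `edited` holds four properties and the three original ones are unchanged, which mirrors how each tool along a workflow can add its own information without disturbing what earlier tools wrote.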


Serialization

The abstract XMP data model needs a concrete representation when it is stored or embedded into a file. As serialization format, a subset of the W3C RDF/XML syntax is most commonly used. It is a syntax to express a Resource Description Framework graph in XML. There are various equivalent ways to serialize the same XMP packet in RDF/XML. The most common metadata tags recorded in XMP data are those from the Dublin Core Metadata Initiative, which include title, description, creator, and so on. The standard is designed to be extensible, allowing users to add their own custom types of metadata into the XMP data. XMP generally does not allow binary data types to be embedded. This means that any binary data one wants to carry in XMP, such as thumbnail images, must be encoded in some XML-friendly format, such as Base64. XMP metadata can describe a document as a whole (the "main" metadata), but can also describe parts of a document, such as pages or included images. This architecture makes it possible to retain authorship and rights information about, for example, images included in a published document. Similarly, it permits documents created from several smaller documents to retain the original metadata associated with the parts.
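As a rough illustration of the points above, the following sketch writes a Dublin Core creator list and a Base64-encoded binary value as RDF/XML using Python's standard library. It is a simplification under stated assumptions: the `ex` namespace, the `thumbnail` property name and the sample values are made up, and the packet wrapper that a complete XMP packet places around the `rdf:RDF` element is omitted.

```python
import base64
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
EX = "http://example.com/ns/demo/"   # hypothetical custom namespace

# Ask ElementTree to use readable prefixes when writing these namespaces.
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)
ET.register_namespace("ex", EX)


def serialize(creators, thumbnail_bytes):
    """Build a minimal RDF/XML fragment with a creator list and a thumbnail."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": ""})

    # A list of values: one rdf:li element per creator inside an rdf:Seq.
    creator = ET.SubElement(desc, f"{{{DC}}}creator")
    seq = ET.SubElement(creator, f"{{{RDF}}}Seq")
    for name in creators:
        ET.SubElement(seq, f"{{{RDF}}}li").text = name

    # Binary data cannot be embedded directly, so it is Base64-encoded first.
    thumb = ET.SubElement(desc, f"{{{EX}}}thumbnail")
    thumb.text = base64.b64encode(thumbnail_bytes).decode("ascii")

    return ET.tostring(root, encoding="unicode")


xml_text = serialize(["Alice"], b"\x89PNG\x00\x01")
```

A reader recovers the binary value by Base64-decoding the text of the custom element; the same data could equally be serialized in other RDF/XML forms, which is why readers should parse the XML rather than match strings.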


Example

This is an example of serialized XMP metadata in a JPEG photo. Only the element values of the XML document are reproduced here, in document order:
* Picasa
* 912, 687, pixel
* 0.680921052631579, 0.3537117903930131, 0.4264919941775837, 0.32127192982456143, normalized
* 912, 687
* 0220
This metadata describes various properties of the image such as the creator tool, the image dimensions and a face region within the image.
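Reading such values back out of a serialized packet could look like the sketch below. The RDF/XML fragment is hand-written for illustration: the property names `xmp:CreatorTool`, `exif:PixelXDimension` and `exif:PixelYDimension` are assumptions chosen to match the values quoted above, and the helper `read_simple_properties` is hypothetical. The face region is left out because it is a nested structure.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xmp": "http://ns.adobe.com/xap/1.0/",
    "exif": "http://ns.adobe.com/exif/1.0/",
}

# Illustrative fragment only; element names are assumed, not quoted.
PACKET = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:xmp="http://ns.adobe.com/xap/1.0/"
      xmlns:exif="http://ns.adobe.com/exif/1.0/">
    <xmp:CreatorTool>Picasa</xmp:CreatorTool>
    <exif:PixelXDimension>912</exif:PixelXDimension>
    <exif:PixelYDimension>687</exif:PixelYDimension>
  </rdf:Description>
</rdf:RDF>"""


def read_simple_properties(xml_text):
    """Collect simple (text-only) properties from every rdf:Description."""
    root = ET.fromstring(xml_text)
    props = {}
    for desc in root.iter(f"{{{NS['rdf']}}}Description"):
        for child in desc:
            # Skip structured values and lists, which have child elements.
            if len(child) == 0 and child.text:
                props[child.tag] = child.text.strip()
    return props


props = read_simple_properties(PACKET)
tool = props[f"{{{NS['xmp']}}}CreatorTool"]
```

The keys of `props` are namespace-qualified tag names, so a property is identified by its namespace URI rather than by whatever prefix the writer happened to choose.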


Embedding

Embedding metadata in files allows easy sharing and transfer of files across products, vendors and platforms without metadata getting lost. Embedding avoids a multitude of problems coming from proprietary vendor-specific metadata databases. XMP can be used in several file formats such as PDF, JPEG, JPEG 2000, JPEG XR, GIF, PNG, WebP, HTML, TIFF, Adobe Illustrator, PSD, MP3, MP4, Audio Video Interleave, WAV, RF64, Audio Interchange File Format, PostScript and Encapsulated PostScript, and has been proposed for DjVu. In a typical edited JPEG file, XMP information is included alongside Exif and IPTC Information Interchange Model data.


Location in file types

For more details, the XMP Specification, Part 3, Storage in Files, listed below describes embedding in specific file formats.
* TIFF: tag 700
* JPEG: application segment 1 (0xFFE1) with segment header "http://ns.adobe.com/xap/1.0/\x00"
* JPEG 2000: "uuid" atom with UID of 0xBE7ACFCB97A942E89C71999491E3AFAC
* PNG: inside an "iTXt" text block with the keyword "XML:com.adobe.xmp"
* GIF: as an Application Extension with identifier "XMP Data" and authentication code "XMP"
* MP3: inside the ID3 block as a "PRIV" frame with an owner identifier of "XMP"
* MP4: top-level "UUID" box with the UUID 0xBE7ACFCB97A942E89C71999491E3AFAC (same as JPEG 2000)
* MOV (QuickTime): "XMP_" atom within a "udta" atom, within a top-level "moov" atom
* PDF: embedded in a metadata stream contained in a PDF object
* WebP: inside the file's XMP chunk
For file formats that have no support for embedded XMP data, this data can be stored in external .xmp sidecar files.
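For the JPEG case listed above, locating the packet amounts to walking the marker segments at the start of the file and checking each application segment 1 for the XMP segment header. The sketch below shows one way this might be done. It is deliberately minimal and makes simplifying assumptions: the helper name `extract_xmp_from_jpeg` is hypothetical, every segment before the image data is assumed to carry a two-byte length field, and packets split across several segments are not handled.

```python
import struct

XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"


def extract_xmp_from_jpeg(data):
    """Return the XMP packet bytes found in a JPEG byte string, or None."""
    if data[:2] != b"\xff\xd8":              # must begin with start-of-image
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:                # not at a marker: give up
            return None
        marker = data[pos + 1]
        if marker == 0xDA:                   # start of scan: image data follows
            return None
        # The big-endian length counts its own two bytes but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(XMP_HEADER):
            return payload[len(XMP_HEADER):]
        pos += 2 + length                    # jump to the next marker
    return None


# Synthetic input for demonstration: start-of-image, one XMP segment,
# then a start-of-scan marker. This is not a decodable image.
packet = b"<x:xmpmeta/>"
body = XMP_HEADER + packet
segment = b"\xff\xe1" + struct.pack(">H", len(body) + 2) + body
jpeg = b"\xff\xd8" + segment + b"\xff\xda"
found = extract_xmp_from_jpeg(jpeg)
```

Checking the segment header matters because Exif data also lives in application segment 1; only the header string distinguishes the two.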


Support and acceptance


XMP Toolkit

The XMP Toolkit implements metadata handling in two libraries:
* XMPCore for creation and manipulation of metadata that follows the XMP Data Model.
* XMPFiles for embedding serialized metadata in files, and for retrieving embedded metadata.
Adobe provides the XMP Toolkit free of charge under a BSD license. The Toolkit includes specification and usage documents (PDFs), API documentation (doxygen/javadoc), C++ source code (XMPCore and XMPFiles) and Java source code (currently only XMPCore). XMPFiles is currently available as a C++/Java implementation for Windows, Mac OS and Unix/Linux.


Free software and open-source tools (read/write support)

* Alfresco - open source CMS, DAM component can read/write XMP (Microsoft Windows, Linux) *