The Open Packaging Conventions (OPC) is a container-file technology initially created by Microsoft to store a combination of XML and non-XML files that together form a single entity such as an Open XML Paper Specification (OpenXPS) document. OPC-based file formats combine two advantages: the independent file entities embedded in the document are left intact, and the resulting files are much smaller than with normal use of XML.
Specifications
The OPC is specified in Part 2 of the Office Open XML standards ISO/IEC 29500:2008 and ECMA-376.
ISO/IEC 29500-2:2008 - Information technology -- Document description and processing languages -- Office Open XML File Formats -- Part 2: Open Packaging Conventions. ISO.
The ISO/IEC 29500-2:2008 specification and the second edition of ECMA-376 make a normative reference to PKWARE, Inc.'s ''.ZIP File Format Specification'' version 6.2.0 (2004) and supplement it with a normative set of clarifications. Note: the older first edition of ECMA-376 makes an informative (''i.e.'', non-normative) reference to the newer PKWARE, Inc. ''.ZIP File Format Specification'' version 6.2.1 (2005).
The ZIP format is not specified by any international standard but has widespread community and developer acceptance.
Microsoft submitted a draft in 2006 to the Internet Engineering Task Force for a "pack" URI scheme (pack://) to be used for URI references to OPC-based packages. The draft expired in 2009; the specified syntax is incompatible with the Internet Standard for URI schemes (STD 66, RFC 3986). The scheme is now listed as ''historical''.
ISO 19165-1:2018 recommends the use of the Open Packaging Conventions to implement the Geospatial Package defined in the Open Archival Information System.
Usage
Both the XML Paper Specification (XPS) and Office Open XML (OOXML) use Open Packaging Conventions (OPC), which provide a profile of the common ZIP format. In addition to data and document content in XML markup, files in the ZIP package can include other text and binary files in formats such as PNG, BMP, AVI, PDF, RTF, or even an already packaged ODF file. OPC also defines some naming conventions and an indirection method to allow position independence of binary and XML files in the ZIP archive.
OPC files can be opened using common ZIP utilities. OPC allows indirection, chunking and relative indirection.
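Because an OPC package is a ZIP archive, its parts can be enumerated with any ZIP library. The following is a minimal sketch using Python's standard `zipfile` module. The helper names `list_parts` and `build_demo_package`, the entry names and the placeholder contents are all hypothetical and serve only to make the example self-contained.

```python
import io
import zipfile


def list_parts(package):
    """Return the entry names stored in an OPC package (path or file object)."""
    with zipfile.ZipFile(package) as zf:
        return zf.namelist()


def build_demo_package():
    """Build a tiny in-memory ZIP with OPC-style entry names.

    The contents are placeholders, not valid OPC markup.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("_rels/.rels", "<Relationships/>")
        zf.writestr("Documents/1/main.xml", "<doc/>")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for name in list_parts(build_demo_package()):
        print(name)
```

Passing the path of a real OPC-based file to `list_parts` instead of the in-memory demo should work the same way, since `zipfile.ZipFile` accepts either a path or a file-like object.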
File formats using the OPC
The OPC is the foundation technology for many new file formats:
Jack Davis, "Adventures in Packaging - Episode 1", Microsoft Packaging Team Blog: Open Packaging Conventions, May 18, 2009.
Programming
OPC is natively supported in Microsoft .NET Framework 3.0 through a dedicated namespace. Open source libraries exist for other languages.
Since Windows 7, OPC is also natively supported in the Windows API through a set of COM interfaces, collectively referred to as the Packaging API.
Alternatively, ZIP libraries can be used to create and open OPC files, as long as the correct files are included in the ZIP and the conventions followed.
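As an illustration of the ZIP-library approach, the sketch below writes a minimal package containing a content types file, a root relationships file and one part. It is a sketch under stated assumptions, not a conformance-checked implementation: the part name `content/main.xml`, the relationship type URI under `example.com` and the function name `write_package` are hypothetical, and the namespace URIs are those commonly seen in OPC-based files and should be checked against the specification.

```python
import io
import zipfile

# Content types: default mappings by file extension.
CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
</Types>
"""

# Package-level relationships; the Type URI is a made-up example.
ROOT_RELS = """<?xml version="1.0" encoding="UTF-8"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://example.com/relationships/main" Target="/content/main.xml"/>
</Relationships>
"""


def write_package(stream):
    """Write a minimal OPC-style package to a path or writable binary stream."""
    with zipfile.ZipFile(stream, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", ROOT_RELS)
        zf.writestr("content/main.xml", "<hello>OPC</hello>")


if __name__ == "__main__":
    buf = io.BytesIO()
    write_package(buf)
    print(len(buf.getvalue()), "bytes written")
```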
Package, parts, and relationships
In OPC terminology, the term ''package'' corresponds to a ZIP archive and the term ''part'' corresponds to a file stored within the ZIP. Every part in a package has a unique URI-compliant part name along with a specified content-type expressed in the form of a MIME media type. A part's content-type explicitly defines the type of data stored in the part and reduces duplication and ambiguity issues inherent with file extensions.
OPC packages can also include ''relationships'' that define associations between the package, parts, and external resources. In addition to a hierarchy of directories and parts, OPC packages commonly use ''relationships'' to access content through a directed graph of relationship associations. Relationships are composed of four elements:
:* an identifier (ID)
:* an optional source (the package or a part within the package)
:* a relationship type (a URI-style expression that defines the type of the relationship)
:* a target (a URI to another part within the package or to an external resource)
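The four elements listed above can be modelled as a small record type. The sketch below parses relationships markup into such records with Python's standard `xml.etree.ElementTree`. The class `Relationship`, the function `parse_relationships`, the sample markup and its `example.com` type URIs are illustrative assumptions. The source is not stored in the markup itself, so it is passed in by the caller, with "/" standing for the package.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


@dataclass(frozen=True)
class Relationship:
    id: str      # identifier, unique within one relationships file
    source: str  # "/" for the package, otherwise the owning part name
    type: str    # URI-style expression naming the kind of relationship
    target: str  # URI of another part or of an external resource


def parse_relationships(xml_text, source="/"):
    """Parse relationships markup into a list of Relationship records."""
    root = ET.fromstring(xml_text)
    return [
        Relationship(
            id=el.get("Id"),
            source=source,
            type=el.get("Type"),
            target=el.get("Target"),
        )
        for el in root.findall(f"{{{RELS_NS}}}Relationship")
    ]


# Hypothetical sample; the Type URIs are made up for illustration.
SAMPLE = """<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://example.com/relationships/document" Target="/Documents/1/main.xml"/>
  <Relationship Id="rId2" Type="http://example.com/relationships/thumbnail" Target="/Metadata/thumbnail.png"/>
</Relationships>"""

if __name__ == "__main__":
    for rel in parse_relationships(SAMPLE):
        print(rel.id, rel.type, "->", rel.target)
```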
OPC packages can store parts that contain any type of data (text, images, XML, binary, and so on). The extension ".rels", however, is reserved for storing relationships metadata within "/_rels" subfolders. The subfolder name "_rels", the file extension ".rels" within such a directory, and the filename "[Content_Types].xml" in any folder are the only three reserved names for files stored in an OPC package.
:; /[Content_Types].xml file
:: This file defines the MIME media types for all the parts stored in the package. The "/[Content_Types].xml" file defines default mappings based on file extensions, along with overrides for specific parts with content-types that are different from the file extension defaults. For example, one of these defined MIME types is:
:::
:; /_rels
:: The root level "/_rels" folder stores the relationships for the package as a whole. The "/_rels" folder normally contains a file named ".rels". "/_rels/.rels" is an XML file where the starting ''package-level relationships'' are stored. Normally when opening an OPC-based file, applications start by reading the "/_rels/.rels" file to obtain the starting package-level relationships.
:; ''[partname]''.rels
:: Each part may have its own relationships. The ''_rels'' folders hold the relationships for any given part within the package. The relationships for a specific part are found in the "_rels" folder that is a sibling of that part: if the part has relationships, the "_rels" folder contains a file whose name is the original part name with ".rels" appended to it. For example, if the content types part file had any relationships, there would be a file called "[Content_Types].xml.rels" inside the "/_rels" folder.
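The default-and-override lookup described for the "/[Content_Types].xml" file can be sketched as follows. This is an illustrative reading, not the specification's algorithm: the function name `content_type_for`, the sample markup and the `application/vnd.example.main+xml` type are hypothetical, and the case-insensitive comparison of part names and extensions is an assumption made here for simplicity.

```python
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"


def content_type_for(part_name, content_types_xml):
    """Return the content type for a part, or None if no rule matches.

    Overrides for a specific part take precedence over extension defaults.
    """
    root = ET.fromstring(content_types_xml)

    # 1. Look for an override that names this exact part.
    for el in root.findall(f"{{{CT_NS}}}Override"):
        if el.get("PartName", "").lower() == part_name.lower():
            return el.get("ContentType")

    # 2. Fall back to the default for the file extension, if there is one.
    last_segment = part_name.rsplit("/", 1)[-1]
    if "." not in last_segment:
        return None
    ext = last_segment.rsplit(".", 1)[-1].lower()
    for el in root.findall(f"{{{CT_NS}}}Default"):
        if el.get("Extension", "").lower() == ext:
            return el.get("ContentType")
    return None


# Hypothetical sample markup for illustration.
SAMPLE = """<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="xml" ContentType="application/xml"/>
  <Default Extension="png" ContentType="image/png"/>
  <Override PartName="/content/main.xml" ContentType="application/vnd.example.main+xml"/>
</Types>"""

if __name__ == "__main__":
    for name in ("/content/main.xml", "/content/other.xml", "/img/a.PNG"):
        print(name, "->", content_type_for(name, SAMPLE))
```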
All relationships (including those associated with the root package) are represented as XML files. Opening a ".rels" file in a text editor shows the actual XML markup that defines all the relationships originating from that part. A typical relationships file for the root package of an early ''Microsoft XPS'' document (before the format was standardized as Open XML Paper Specification within the openxmlformats collection) defines two relations: the first identifies the root of the package, and the other references an alternate form (a thumbnail image rendering the first page of the document).
The main parts of the embedded documents are often stored within a folder named "/Document" (which may itself contain subdirectories if the file contains several related documents, each with various parts), and the optional metadata parts that are not needed for processing the main parts of the document are stored in a folder named "/Metadata". However, these two folder names are not required: the actual names are specified within the XML-formatted data in the "''[partname]''.rels" relationship files, and the OPC specification allows any folder organisation that is convenient for the application.
Chunking
OPC encourages documents to be split into small chunks. This reduces the effect of file corruption and improves data access: for example, all the style information can be stored in one XML part, and each separate worksheet or table in its own part. This allows faster access and less object creation for clients, and makes it easier for multiple processes to work on the same document.
Relative indirection
In the Open Packaging Conventions, each file that contains references has its own ''_rels'' file with the indirection lists. In some cases this makes it easier to cut and paste information together with all its associated resources, and it provides name scoping that removes the chance of name clashes between files.
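The naming rule for per-part relationship files, and the way a relative target is resolved against the part that owns it, can be sketched with Python's `posixpath` module. The function names `rels_part_name` and `resolve_target` and the example part names are hypothetical. The resolution shown is a simplification of RFC 3986 reference resolution that only handles targets inside the package.

```python
import posixpath


def rels_part_name(part_name):
    """Return the name of the relationships file for a part.

    "/" stands for the package itself, whose relationships live in
    "/_rels/.rels". For any other part, the file sits in a sibling
    "_rels" folder and is named after the part with ".rels" appended.
    """
    if part_name == "/":
        return "/_rels/.rels"
    folder, name = posixpath.split(part_name)
    return posixpath.join(folder, "_rels", name + ".rels")


def resolve_target(source_part, target):
    """Resolve an internal relationship target to an absolute part name.

    Relative targets are interpreted against the folder of the source part
    (simplified; external targets are not handled here).
    """
    if target.startswith("/"):
        return posixpath.normpath(target)
    base = "/" if source_part == "/" else posixpath.dirname(source_part)
    return posixpath.normpath(posixpath.join(base, target))


if __name__ == "__main__":
    print(rels_part_name("/Documents/1/main.xml"))
    print(resolve_target("/Documents/1/main.xml", "../images/a.png"))
```

Because targets are resolved relative to the owning part, a folder holding a part, its "_rels" file and its resources can in principle be moved as a unit without rewriting the relative targets inside it.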
References
External links
Download specification ISO/IEC 29500-2:2012
OPC: A New Standard for Packaging Your Data
Essentials of the Open Packaging Conventions
OPC Digital Signatures: Application Guidelines for Common Criteria Security
Packaging team blog
Open Packaging Conventions (OPC) MSDN Forum
The Addressing Model of the Open Packaging Conventions
OPC implementation test documents
OPC package explorer to edit XML parts
ISO 19165 Geographic information – Preservation of digital data and metadata – Part 1: Fundamentals