COM Structured Storage

COM Structured Storage (variously known as COM structured storage or OLE structured storage) is a technology developed by Microsoft as part of its Windows operating system for storing hierarchical data within a single file. Strictly speaking, the term structured storage refers to a set of COM interfaces that a conforming implementation must provide, and not to a specific implementation, nor to a specific file format (in fact, a structured storage implementation need not store its data in a file at all). In addition to providing a hierarchical structure for data, structured storage may also provide a limited form of transactional support for data access. Microsoft provides an implementation that supports transactions, as well as one that does not, called simple-mode storage. The latter implementation is limited in other ways as well, although it performs better.

Structured storage is widely used in Microsoft Office applications, although newer releases (starting with Office 2007) use the XML-based Office Open XML format by default. It is also an important part of both COM and the related Object Linking and Embedding (OLE) technologies. Other notable applications of structured storage include SQL Server, the Windows shell, and many third-party CAD programs.


Motivation

Structured storage addresses some inherent difficulties of storing multiple data objects within a single file. One difficulty arises when an object persisted in the file changes in size due to an update. If the application that reads and writes the file expects the objects to remain in a certain order, everything following that object's representation in the file may need to be shifted backward to make room if the object grows, or forward to fill in the leftover space if the object shrinks. If the file is large, this can be a costly operation. There are many possible solutions to this difficulty, but the application programmer often does not want to deal with low-level details such as binary file formats.

Structured storage provides an abstraction known as a stream, represented by the interface IStream. A stream is conceptually very similar to a file, and the IStream interface provides methods for reading and writing similar to file input/output. A stream could reside in memory, within a file, within another stream, etc., depending on the implementation.

Another important abstraction is that of a storage, represented by the interface IStorage. A storage is conceptually very similar to a directory on a file system. Storages can contain streams, as well as other storages. If an application wishes to persist several data objects to a file, one way to do so would be to open an IStorage that represents the contents of that file and save each of the objects in its own IStream. One way to accomplish the latter is through the standard COM interface IPersistStream. OLE depends heavily on this model to embed objects within documents.
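The storage/stream relationship can be illustrated with a small in-memory model. The following is a toy sketch in Python, not the COM API: the classes `Stream` and `Storage`, their methods, and the names "Body", "Embedded" and "Chart" are all hypothetical and only mirror the directory-like and file-like roles described above.

```python
import io


class Stream:
    """Toy stand-in for a stream: a named, file-like run of bytes."""

    def __init__(self, name):
        self.name = name
        self._buf = io.BytesIO()

    def write(self, data):
        return self._buf.write(data)

    def seek(self, offset):
        return self._buf.seek(offset)

    def read(self, size=-1):
        return self._buf.read(size)


class Storage:
    """Toy stand-in for a storage: a named container of streams and storages."""

    def __init__(self, name):
        self.name = name
        self.children = {}  # child name -> Stream or Storage

    def create_stream(self, name):
        stream = Stream(name)
        self.children[name] = stream
        return stream

    def create_storage(self, name):
        storage = Storage(name)
        self.children[name] = storage
        return storage

    def walk(self, prefix=""):
        """Yield the path of every stream at or below this storage."""
        for name, child in self.children.items():
            path = prefix + "/" + name
            if isinstance(child, Storage):
                yield from child.walk(path)
            else:
                yield path


def build_example():
    """Build a hypothetical document: one top-level stream, one nested object."""
    root = Storage("document")
    body = root.create_stream("Body")
    body.write(b"main text")
    embedded = root.create_storage("Embedded")
    embedded.create_stream("Chart").write(b"chart data")
    # Growing one stream does not require moving any other object.
    body.write(b", revised and longer")
    return root
```

Because each object lives in its own stream, the application in this sketch never has to shift neighbouring data when one object grows; how the bytes are physically laid out is left to the implementation behind the abstraction.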


Format

Microsoft's implementation uses a file format known as compound files, and all of the widely deployed structured storage implementations read and write this format. Compound files use a FAT-like (File Allocation Table) structure to represent storages and streams. Chunks of the file, known as sectors (these may or may not correspond to sectors of the underlying file system), are allocated as needed to add new streams and to increase the size of existing streams. If streams are deleted or shrink, leaving unallocated sectors, those sectors can be reused for new streams.

The following applications use the OLE Structured Storage (Compound Document Format):
* Microsoft Office 97-2003 documents:
** Word documents (.DOC, .DOT)
** Excel spreadsheets (.XLS, .XLT)
** PowerPoint presentations (.PPT, .POT)
** Publisher files (.PUB)
** Visio files (.VSD)
** Project files (.MPP)
** Microsoft PhotoDraw files (.MIX)
** Microsoft Outlook files (.MSG)
* Windows Installer files (.MSI, .MSP, .MST)
* Microsoft Picture It! / Microsoft Digital Image files (.MIX)
* Internet Explorer RSS Feeds (Windows RSS Platform) files (.feed-ms)
* Windows 7 StickyNotes (.SNT)
* Windows 7 jumplists files
* Thumbs.db
* Microsoft SQL 2000 Server DTS packages
* Autodesk Revit
* Autodesk Inventor
* FlashPix
* Altium Designer
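The sector allocation and reuse described in this section can be sketched with a toy allocation table. This is a simplified illustration in Python, not the actual compound file layout: the class `SectorFile`, the 8-byte sector size, and the marker values `FREE` and `END_OF_CHAIN` are assumptions chosen to keep the chains easy to follow.

```python
FREE = -1          # marks an unallocated sector (arbitrary value in this sketch)
END_OF_CHAIN = -2  # marks the last sector of a stream (arbitrary value)
SECTOR_SIZE = 8    # deliberately tiny so that chains stay readable


class SectorFile:
    """Toy container that stores streams as chains of fixed-size sectors."""

    def __init__(self):
        self.fat = []      # fat[i] is the sector after i, or FREE / END_OF_CHAIN
        self.sectors = []  # sectors[i] holds the bytes stored in sector i
        self.starts = {}   # stream name -> index of its first sector

    def _allocate(self):
        # Reuse the first free sector if there is one, otherwise grow the file.
        for index, entry in enumerate(self.fat):
            if entry == FREE:
                return index
        self.fat.append(FREE)
        self.sectors.append(b"")
        return len(self.fat) - 1

    def chain(self, name):
        """Return the list of sector indexes that make up a stream."""
        result = []
        index = self.starts[name]
        while index != END_OF_CHAIN:
            result.append(index)
            index = self.fat[index]
        return result

    def delete_stream(self, name):
        """Release every sector of a stream so it can be reused."""
        if name not in self.starts:
            return
        for index in self.chain(name):
            self.fat[index] = FREE
        del self.starts[name]

    def write_stream(self, name, data):
        """Store data under name, replacing any earlier stream of that name."""
        self.delete_stream(name)
        chunks = [data[i:i + SECTOR_SIZE]
                  for i in range(0, len(data), SECTOR_SIZE)] or [b""]
        previous = None
        for chunk in chunks:
            index = self._allocate()
            self.fat[index] = END_OF_CHAIN  # claimed; relinked below if needed
            self.sectors[index] = chunk
            if previous is None:
                self.starts[name] = index
            else:
                self.fat[previous] = index
            previous = index

    def read_stream(self, name):
        return b"".join(self.sectors[i] for i in self.chain(name))
```

In this model, deleting a three-sector stream and then writing a two-sector stream places the new data in the freed sectors rather than growing the file, and no other stream is moved.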


Native Structured Storage

During its beta testing phase, Windows 2000 included a feature called Native Structured Storage (NSS) for storing Structured Storage documents (such as the binary Microsoft Office formats and the thumbs.db file that Windows Explorer uses to cache thumbnails), with each stream that makes up a document stored in a separate NTFS data stream. It included utilities that automatically split the streams of a regular Structured Storage document into NTFS data streams and vice versa. However, the feature was withdrawn after Beta 3 due to incompatibilities with other OS components, and any NSS files were automatically converted to the single data stream format.


Implementations

* For Microsoft .NET:
** OpenMCDF – Free .NET component for accessing OLE structured storage files, MPL licensed.
* For Linux:
** GNOME Structured File Library – Can read Microsoft structured storage files.
** POLE
* Cross-platform C++ for Windows/Mac OS X/Linux:
** POLE v3 and up
* For Java:
** Java implementation of the OLE 2 Compound Document format, part of Apache POI.
* For Perl:
** LAOLA Binary Structures
* For JavaScript:
** js-cfb – JavaScript implementation of the OLE 2 Compound Document format.
* For Python:
** compoundfiles – Python implementation of the Microsoft Compound File Binary (CFB) format.
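Before handing a file to one of the libraries listed above, a program can cheaply check whether it looks like a compound file at all. The sketch below uses only the Python standard library and relies on the 8-byte signature that the Compound File Binary specification places at the start of the file. The function names are hypothetical, and a matching signature does not guarantee that the rest of the file is well formed.

```python
# Signature bytes at offset 0 of a compound file, per the CFB specification.
CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def has_cfb_signature(header):
    """Return True if the given bytes start with the compound file signature."""
    return header[:len(CFB_SIGNATURE)] == CFB_SIGNATURE


def file_has_cfb_signature(path):
    """Read only the first bytes of a file and test them for the signature."""
    with open(path, "rb") as handle:
        return has_cfb_signature(handle.read(len(CFB_SIGNATURE)))
```

A check like this only rules out obviously unrelated files, such as a ZIP-based Office Open XML document saved with a legacy extension. Parsing the storages and streams is still left to a full implementation.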


External links

* Open Specifications: Compound File Binary File Format