
A data dictionary, or metadata repository, as defined in the ''IBM Dictionary of Computing'', is a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format". Oracle defines it as a collection of tables with metadata. The term can have one of several closely related meanings pertaining to databases and database management systems (DBMS):
* A document describing a database or collection of databases
* An integral component of a DBMS that is required to determine its structure
* A piece of middleware that extends or supplants the native data dictionary of a DBMS
Documentation
The terms ''data dictionary'' and ''data repository'' indicate a more general software utility than a catalogue. A ''catalogue'' is closely coupled with the DBMS software. It provides the information stored in it to the user and the DBA, but it is mainly accessed by the various software modules of the DBMS itself, such as DDL and DML compilers, the query optimiser, the transaction processor, report generators, and the constraint enforcer. On the other hand, a ''data dictionary'' is a data structure that stores metadata, i.e., (structured) data about information. The software package for a stand-alone data dictionary or data repository may interact with the software modules of the DBMS, but it is mainly used by the designers, users, and administrators of a computer system for information resource management. These systems maintain information on system hardware and software configuration, documentation, applications, and users, as well as other information relevant to system administration.
If a data dictionary system is used only by the designers, users, and administrators and not by the DBMS software, it is called a ''passive data dictionary''. Otherwise, it is called an ''active data dictionary'' or simply a ''data dictionary''. A passive data dictionary is updated manually, independently of any changes to the structure of the DBMS (database). With an active data dictionary, the dictionary is updated first and the changes occur in the DBMS automatically as a result.
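The difference between the two update styles can be illustrated with a minimal Python sketch using the standard `sqlite3` module. The `data_dictionary` table, the `customer` table, and the helper names `document_field` and `define_field` are hypothetical and exist only for this illustration; real DBMS products implement active dictionaries internally and in their own ways.

```python
import sqlite3


def column_names(conn, table):
    """Return the column names the DBMS itself reports for a table."""
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


def document_field(conn, table, field, sql_type, description):
    """Passive style: record the field in the dictionary only."""
    conn.execute(
        "INSERT INTO data_dictionary VALUES (?, ?, ?, ?)",
        (table, field, sql_type, description),
    )


def define_field(conn, table, field, sql_type, description):
    """Active style: update the dictionary first, then derive the schema change."""
    document_field(conn, table, field, sql_type, description)
    # Identifiers are interpolated directly, so they are assumed to be trusted.
    conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{field}" {sql_type}')


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_dictionary "
    "(table_name TEXT, field_name TEXT, sql_type TEXT, description TEXT)"
)
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")

define_field(conn, "customer", "email", "TEXT", "Contact address")
document_field(conn, "customer", "phone", "TEXT", "Documented, not yet created")

print(column_names(conn, "customer"))  # prints ['id', 'email']
```

In the passive case the dictionary and the actual schema can drift apart (here, `phone` is documented but does not exist), which is why passive dictionaries need manual upkeep.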
Database users and application developers can benefit from an authoritative data dictionary document that catalogs the organization, contents, and conventions of one or more databases. This typically includes the names and descriptions of various tables (records or entities) and their contents (fields) plus additional details, like the type and length of each data element. Another important piece of information that a data dictionary can provide is the relationship between tables. This is sometimes referred to in entity-relationship diagrams (ERDs), or if using set descriptors, identifying which sets database tables participate in.
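As a rough illustration of how such a document might be derived, the following Python sketch reads the catalog of an SQLite database and collects tables, fields, declared types, and foreign-key relationships. It relies on SQLite's `sqlite_master` table and the `PRAGMA table_info` and `PRAGMA foreign_key_list` statements; the function name `describe_database` and the sample tables are assumptions made for the example, and other DBMSs expose this information through different catalog views.

```python
import sqlite3


def describe_database(conn):
    """Build a plain dictionary describing each table, its fields, and its relationships."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        if not row[0].startswith("sqlite_")  # skip SQLite's internal tables
    ]
    document = {}
    for table in names:
        columns = [
            {
                "name": name,
                "type": declared_type,
                "required": bool(notnull),
                "primary_key": bool(pk),
            }
            for _cid, name, declared_type, notnull, _default, pk in conn.execute(
                f'PRAGMA table_info("{table}")'
            )
        ]
        # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        relationships = [
            {"field": row[3], "references": f"{row[2]}.{row[4]}"}
            for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        ]
        document[table] = {"columns": columns, "relationships": relationships}
    return document


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name VARCHAR(40) NOT NULL)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customer(id))"
)

for table, entry in describe_database(conn).items():
    print(table, entry["relationships"])
```

A generated description like this covers only what the catalog knows; free-text descriptions and naming conventions would still have to be added by hand.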
In an active data dictionary, constraints may be placed upon the underlying data. For instance, a range may be imposed on the value of numeric data in a data element (field), or a record in a table may be forced to participate in a set relationship with another record type. Additionally, a distributed DBMS may have certain location specifics described within its active data dictionary (e.g. where tables are physically located).
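The range case can be sketched in Python with `sqlite3`: a dictionary entry carrying `min` and `max` values is turned into a `CHECK` constraint, so the DBMS itself rejects out-of-range data. The `create_table` helper, the layout of the `fields` mapping, and the `order_line` table are hypothetical; set relationships and location details are not covered here.

```python
import sqlite3


def create_table(conn, table, fields):
    """Create a table whose CHECK constraints come from dictionary ranges."""
    columns = []
    for name, spec in fields.items():
        column = f'"{name}" {spec["type"]}'
        if "min" in spec and "max" in spec:
            # Range bounds are assumed to be trusted numeric literals.
            column += f' CHECK ("{name}" BETWEEN {spec["min"]} AND {spec["max"]})'
        columns.append(column)
    conn.execute(f'CREATE TABLE "{table}" ({", ".join(columns)})')


conn = sqlite3.connect(":memory:")
create_table(
    conn,
    "order_line",
    {
        "id": {"type": "INTEGER"},
        "quantity": {"type": "INTEGER", "min": 1, "max": 100},
    },
)

conn.execute("INSERT INTO order_line (id, quantity) VALUES (1, 5)")  # within range
try:
    conn.execute("INSERT INTO order_line (id, quantity) VALUES (2, 500)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the DBMS enforced the range taken from the dictionary

print(rejected)  # prints True
```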
The data dictionary consists of record types (tables) created in the database by system-generated command files, tailored for each supported back-end DBMS. Oracle has a list of specific views for the "sys" user, which allows users to look up the exact information that is needed. Command files contain SQL statements for CREATE TABLE, CREATE UNIQUE INDEX, ALTER TABLE (for referential integrity), etc., using the specific statement required by that type of database.
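A minimal Python sketch of generating such command files follows. The `TYPE_MAP` table, the `command_file` function, and the index and constraint naming scheme are illustrative assumptions, not the format of any particular product; the mapping covers only two generic types for two back ends.

```python
# Hypothetical mapping from generic dictionary types to back-end column types.
TYPE_MAP = {
    "oracle": {"string": "VARCHAR2({length})", "integer": "NUMBER(10)"},
    "postgresql": {"string": "VARCHAR({length})", "integer": "INTEGER"},
}


def command_file(table, fields, dialect, unique=(), foreign_keys=()):
    """Return the SQL statements that create one dictionary table on one back end."""
    types = TYPE_MAP[dialect]
    columns = ", ".join(
        fld["name"] + " " + types[fld["type"]].format(length=fld.get("length", 0))
        for fld in fields
    )
    statements = [f"CREATE TABLE {table} ({columns})"]
    for field in unique:
        statements.append(
            f"CREATE UNIQUE INDEX {table}_{field}_ux ON {table} ({field})"
        )
    for field, ref_table, ref_field in foreign_keys:
        # Referential integrity is added afterwards with ALTER TABLE.
        statements.append(
            f"ALTER TABLE {table} ADD CONSTRAINT {table}_{field}_fk "
            f"FOREIGN KEY ({field}) REFERENCES {ref_table} ({ref_field})"
        )
    return statements


fields = [
    {"name": "id", "type": "integer"},
    {"name": "code", "type": "string", "length": 12},
    {"name": "customer_id", "type": "integer"},
]
for dialect in ("oracle", "postgresql"):
    script = command_file(
        "orders", fields, dialect,
        unique=["code"],
        foreign_keys=[("customer_id", "customer", "id")],
    )
    print(";\n".join(script) + ";\n")
```

Keeping the type mapping separate from the statement templates means one dictionary definition can drive the command files for every supported back end.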
There is no universal standard as to the level of detail in such a document.
Middleware
In the construction of database applications, it can be useful to introduce an additional layer of data dictionary software, i.e. middleware, which communicates with the underlying DBMS data dictionary. Such a "high-level" data dictionary may offer additional features and a degree of flexibility that goes beyond the limitations of the native "low-level" data dictionary, whose primary purpose is to support the basic functions of the DBMS, not the requirements of a typical application. For example, a high-level data dictionary can provide alternative entity-relationship models tailored to suit different applications that share a common database. Extensions to the data dictionary also can assist in query optimization against distributed databases. Additionally, DBA functions are often automated using restructuring tools that are tightly coupled to an active data dictionary.
Software frameworks aimed at rapid application development sometimes include high-level data dictionary facilities, which can substantially reduce the amount of programming required to build menus, forms, reports, and other components of a database application, including the database itself. For example, PHPLens includes a PHP class library to automate the creation of tables, indexes, and foreign key constraints portably for multiple databases. Another PHP-based data dictionary, part of the RADICORE toolkit, automatically generates program objects, scripts, and SQL code for menus and forms with data validation and complex joins. For the ASP.NET environment, Base One's data dictionary provides cross-DBMS facilities for automated database creation, data validation, performance enhancement (caching and index utilization), application security, and extended data types.
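To give a sense of how dictionary entries can replace hand-written form code, here is a small Python sketch that renders an HTML form from a list of field definitions. It does not reproduce the API of PHPLens, RADICORE, or Base One; the `render_form` function and the keys `name`, `title`, `type`, and `required` are assumptions made for this example.

```python
from html import escape

# Hypothetical mapping from dictionary field types to HTML input types.
INPUT_TYPES = {"integer": "number", "date": "date"}


def render_form(entity, fields):
    """Render a simple HTML form from data dictionary entries."""
    lines = [f'<form id="{escape(entity)}">']
    for fld in fields:
        name = escape(fld["name"])
        # The displayed title falls back to the field name when none is given.
        title = escape(fld.get("title") or fld["name"])
        input_type = INPUT_TYPES.get(fld["type"], "text")
        required = " required" if fld.get("required") else ""
        lines.append(
            f'  <label>{title} <input name="{name}" type="{input_type}"{required}></label>'
        )
    lines.append("</form>")
    return "\n".join(lines)


customer_fields = [
    {"name": "email", "type": "string", "title": "E-mail", "required": True},
    {"name": "age", "type": "integer"},
]
print(render_form("customer", customer_fields))
```

Adding a field to the dictionary is then enough to make it appear on every form built this way, which is the kind of saving such frameworks aim for.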
Visual DataFlex provides the ability to use DataDictionaries as class files to form a middle layer between the user interface and the underlying database. The intent is to create standardized rules to maintain data integrity and enforce business rules throughout one or more related applications.
Some industries use generalized data dictionaries as technical standards to ensure interoperability between systems. The real estate industry, for example, abides by RESO's Data Dictionary, with which the National Association of REALTORS mandates that its MLSs comply through its policy handbook. This intermediate mapping layer for MLSs' native databases is supported by software companies which provide API services to MLS organizations.
Platform-specific examples
Developers use a data description specification (DDS) to describe data attributes in file descriptions that are external to the application program that processes the data, in the context of IBM i. The ''sys.ts$'' table in Oracle stores information about every table in the database. It is part of the data dictionary that is created when the Oracle Database is created.
Developers may also use DDS context from free and open-source software (FOSS) for structured and transactional queries in open environments.
Typical attributes
Here is a non-exhaustive list of typical items found in a data dictionary for columns or fields:
* Entity or form name or their ID (EntityID or FormID). The group this field belongs to.
* Field name, such as RDBMS field name
* Displayed field title. May default to field name if blank.
* Field type (string, integer, date, etc.)
* Measures such as min and max values, display width, or number of decimal places. Different field types may interpret this differently. An alternative is to have different attributes depending on field type.
* Field display order or tab order
* Coordinates on screen (if a positional or grid-based UI)
* Default value
* Prompt type, such as drop-down list, combo-box, check-boxes, range, etc.
* Is-required (Boolean) - If 'true', the value cannot be blank, null, or only whitespace
* Is-read-only (Boolean)
* Reference table name, if a foreign key. Can be used for validation or selection lists.
* Various event handlers or references to them. Examples: "on-click", "on-validate", etc. See event-driven programming.
* Format code, such as a regular expression or COBOL-style "PIC" statements
* Description or synopsis
* Database index characteristics or specification
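Several of the attributes listed above can be gathered into a single record per field. The Python sketch below is one possible shape, not a standard: the `FieldDefinition` class, its attribute names, and the `problems` method are hypothetical, and attributes such as display order, screen coordinates, event handlers, and index details are left out for brevity.

```python
import re
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class FieldDefinition:
    entity: str                            # entity or form the field belongs to
    name: str                              # field name, e.g. the RDBMS column name
    field_type: str                        # "string", "integer", "date", ...
    title: str = ""                        # displayed title; blank means use the name
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    default: Any = None
    required: bool = False
    read_only: bool = False
    reference_table: Optional[str] = None  # set when the field is a foreign key
    format_code: Optional[str] = None      # a regular expression in this sketch
    description: str = ""

    def display_title(self) -> str:
        return self.title or self.name

    def problems(self, value: Any) -> List[str]:
        """Return a list of rule violations for a candidate value."""
        blank = value is None or (isinstance(value, str) and not value.strip())
        if blank:
            return ["value is required"] if self.required else []
        issues = []
        if isinstance(value, (int, float)):
            if self.min_value is not None and value < self.min_value:
                issues.append("value is below the minimum")
            if self.max_value is not None and value > self.max_value:
                issues.append("value is above the maximum")
        if self.format_code and not re.fullmatch(self.format_code, str(value)):
            issues.append("value does not match format")
        return issues


zip_code = FieldDefinition(
    entity="customer", name="zip", field_type="string",
    required=True, format_code=r"\d{5}",
)
print(zip_code.display_title())   # prints zip
print(zip_code.problems("1234"))  # prints ['value does not match format']
```

An alternative design, as the list notes, is to use different attribute sets per field type instead of one record with optional attributes.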
See also
* Data hierarchy
* Data modeling
* Database catalog
* Database schema
* ISO/IEC 11179
* Metadata registry
* Semantic spectrum
* Vocabulary OneSource
* Metadata repository
External links
* Yourdon, ''Structured Analysis Wiki'': Data Dictionaries (Web archive)
* Octopai: Data Dictionary vs. Business Glossary