Sedna is an
open-source
Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized sof ...
database management system
In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases span ...
that provides
native
Native may refer to:
People
* Jus soli, citizenship by right of birth
* Indigenous peoples, peoples with a set of specific rights based on their historical ties to a particular territory
** Native Americans (disambiguation)
In arts and entert ...
storage for
XML
Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. T ...
data.
The distinctive design decisions employed in Sedna are (i)
schema
The word schema comes from the Greek word ('), which means ''shape'', or more generally, ''plan''. The plural is ('). In English, both ''schemas'' and ''schemata'' are used as plural forms.
Schema may refer to:
Science and technology
* SCHEMA ...
-based
clustering storage strategy for XML data and (ii)
memory management
Memory management is a form of resource management applied to computer memory. The essential requirement of memory management is to provide ways to dynamically allocate portions of memory to programs at their request, and free it for reuse when ...
based on layered
address space
In computing, an address space defines a range of discrete addresses, each of which may correspond to a network host, peripheral device, disk sector, a memory cell or other logical or physical entity.
For software programs to save and retrieve st ...
.
[Ilya Taranov et al. Sedna: native XML database management system (internals overview). In ACM SIGMOD '10: Proceedings of the 36th international conference on Association for Computing Machinery's Special Interest Group on Management of Data, pages 1037-1045, New York, NY, USA, 2010. ACM.]
Data organization
Data
In the pursuit of knowledge, data (; ) is a collection of discrete values that convey information, describing quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted ...
organization in Sedna is designed with the goal of providing a balance in
performance
A performance is an act of staging or presenting a play, concert, or other form of entertainment. It is also defined as the action or process of carrying out or accomplishing an action, task, or function.
Management science
In the work place ...
between XML queries and updates execution.
The two primary design decisions in data organization in Sedna are:
# ''Direct''
pointers are used to represent XML node relationships such as parent, child, and sibling ones. Unlike
relational-based approaches that require performing
joins Join may refer to:
* Join (law), to include additional counts or additional defendants on an indictment
*In mathematics:
** Join (mathematics), a least upper bound of sets orders in lattice theory
** Join (topology), an operation combining two topo ...
for traversing an XML document, traversing in Sedna is performed by simply following a direct pointer.
# A ''descriptive
schema
The word schema comes from the Greek word ('), which means ''shape'', or more generally, ''plan''. The plural is ('). In English, both ''schemas'' and ''schemata'' are used as plural forms.
Schema may refer to:
Science and technology
* SCHEMA ...
-driven
storage strategy'' is developed which consists of
clustering nodes
In general, a node is a localized swelling (a "knot") or a point of intersection (a Vertex (graph theory), vertex).
Node may refer to:
In mathematics
*Vertex (graph theory), a vertex in a mathematical graph
*Vertex (geometry), a point where two ...
of an XML document according to their positions in the descriptive schema of the document. In contrast to a prescriptive schema that is known in advance and is usually specified in
DTD or
XML Schema
An XML schema is a description of a type of Extensible Markup Language, XML document, typically expressed in terms of constraints on the structure and content of documents of that type, above and beyond the basic syntactical constraints imposed ...
, the descriptive schema is generated from data dynamically (and is maintained
incrementally) and represents a concise and an accurate
structure
A structure is an arrangement and organization of interrelated elements in a material object or system, or the object or system so organized. Material structures include man-made objects such as buildings and machines and natural objects such as ...
summary for data. Using the descriptive schema instead of the prescriptive one makes the storage strategy applicable to any XML document, even a one that comes with no prescriptive schema.
The following figure illustrates the overall principles of data organization in Sedna.
The descriptive schema represented as a
tree
In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are ...
of schema nodes is the central component in the data organization.
Each schema node is labeled with an XML node kind
[M.F. Fernandez, A. Malhotra, J. Marsh, M.Nagy, and N. Walsh (editors). ]XQuery
XQuery (XML Query) is a query and functional programming language that queries and transforms collections of structured and unstructured data, usually in the form of XML, text and with vendor-specific extensions for other data formats (JSON, bin ...
1.0 and XPath
XPath (XML Path Language) is an expression language designed to support the query or transformation of XML documents. It was defined by the World Wide Web Consortium (W3C) and can be used to compute values (e.g., strings, numbers, or Boolean v ...
2.0 Data Model
A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be co ...
(XDM). W3C Recommendation
The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. Founded in 1994 and led by Tim Berners-Lee, the consortium is made up of member organizations that maintain full-time staff working to ...
, World Wide Web Consortium
The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. Founded in 1994 and led by Tim Berners-Lee, the consortium is made up of member organizations that maintain full-time staff working to ...
, January 2007. (e.g. ''element'', ''attribute'', ''text'', etc.) and has a
pointer to data
blocks that store XML
nodes
In general, a node is a localized swelling (a "knot") or a point of intersection (a Vertex (graph theory), vertex).
Node may refer to:
In mathematics
*Vertex (graph theory), a vertex in a mathematical graph
*Vertex (geometry), a point where two ...
corresponding to the given schema node.
Depending on their node kind, some schema nodes are also labeled with
names
A name is a term used for identification by an external observer. They can identify a class or category of things, or a single thing, either uniquely, or within a given context. The entity identified by a name is called its referent. A persona ...
(e.g., element nodes, attribute nodes).
Data blocks related to a common schema node are linked via
pointers into a bidirectional
list
A ''list'' is any set of items in a row. List or lists may also refer to:
People
* List (surname)
Organizations
* List College, an undergraduate division of the Jewish Theological Seminary of America
* SC Germania List, German rugby union ...
. Node descriptors in a list of blocks are
partly ordered according to document order.
[S. Boag, D. Chamberlin, M. F. Fernandez, D. Florescu, J. Robie, and J. Simeon (editors). XQuery 1.0: An XML query language. ]W3C recommendation
The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. Founded in 1994 and led by Tim Berners-Lee, the consortium is made up of member organizations that maintain full-time staff working to ...
, World Wide Web Consortium
The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. Founded in 1994 and led by Tim Berners-Lee, the consortium is made up of member organizations that maintain full-time staff working to ...
, January 2007
Citations
{{reflist
External links
Sedna project homepage
Free database management systems
XML databases
Database-related software for Linux