HOME

TheInfoList



OR:

In
computing Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes, and development of both hardware and software. Computing has scientific, e ...
, a piece table is a
data structure In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, a ...
typically used to represent a
text document A text file (sometimes spelled textfile; an old alternative name is flatfile) is a kind of computer file that is structured as a sequence of lines of electronic text. A text file exists stored as data within a computer file system. In operat ...
while it is edited in a
text editor A text editor is a type of computer program that edits plain text. Such programs are sometimes known as "notepad" software (e.g. Windows Notepad). Text editors are provided with operating systems and software development packages, and can be us ...
. Initially a reference (or 'span') to the whole of the original file is created, which represents the as yet unchanged file. Subsequent inserts and deletes replace a span by combinations of one, two, or three references to sections of either the original document or to a buffer holding inserted text. Typically the text of the original document is held in one immutable block, and the text of each subsequent insert is stored in new immutable blocks. Because even deleted text is still included in the piece table, this makes multi-level or unlimited
undo Undo is an interaction technique which is implemented in many computer programs. It erases the last change done to the document, reverting it to an older state. In some more advanced programs, such as graphic processing, undo will negate the las ...
easier to implement with a piece table than with alternative data structures such as a
gap buffer A gap buffer in computer science is a dynamic array that allows efficient insertion and deletion operations clustered near the same location. Gap buffers are especially common in text editors, where most changes to the text occur at or near the cur ...
. This data structure was invented by
J Strother Moore J Strother Moore (his first name is the alphabetic character "J" – not an abbreviated "J.") is a computer scientist. He is a co-developer of the Boyer–Moore string-search algorithm, Boyer–Moore majority vote algorithm, and the Boyer–Moo ...
. David Lu
"What's been wrought using the Piece Table?"

discussion


Description

For this description, we use
buffer Buffer may refer to: Science * Buffer gas, an inert or nonflammable gas * Buffer solution, a solution used to prevent changes in pH * Buffering agent, the weak acid or base in a buffer solution * Lysis buffer, in cell biology * Metal ion buffer * ...
as the immutable block to hold the contents. A piece table consists of three columns: * Which buffer * Start index in the buffer * Length in the buffer In addition to the table, two buffers are used to handle edits: * "''Original buffer''": A buffer to the original text document. This buffer is read-only. * "''Add buffer''": A buffer to a temporary file. This buffer is append-only.


Operations


Index

''Definition:'' Index(i): return the character at position ''i''
To retrieve the ''i''-th character, the appropriate entry in a piece table is read.


Example

Given the following buffers and piece table: To access the ''i''-th character, the appropriate entry in the piece table is looked up. For instance, to get the value of Index(15), the 3rd entry of piece table is retrieved. This is because the 3rd entry describes the characters from index 12 to 16 (the first entry describes characters in index 0 to 5, the next one is 6 to 11). The piece table entry instructs the program to look for the characters in the "''add file''" buffer, starting at index 18 in that buffer. The relative index in that entry is 15-12 = 3, which is added to the start position of the entry in the buffer to obtain index of the letter: 3+18 = 21. The value of Index(15) is the 21st character of the "add file" buffer, which is the character "o". For the buffers and piece table given above, the following text is shown: Lorem ipsum dolor sit amet


Insert

Inserting characters to the text consists of: * Appending characters to the "add file" buffer, and * Updating the entry in piece table (breaking an entry into two or three)


Delete

Single character deletion can be one of two possible conditions: * The deletion is at the start or end of a piece entry, in which case the appropriate entry in piece table is modified. * The deletion is in the middle of a piece entry, in which case the entry is split then one of the successor entries is modified as above.


Usage

Several
text editor A text editor is a type of computer program that edits plain text. Such programs are sometimes known as "notepad" software (e.g. Windows Notepad). Text editors are provided with operating systems and software development packages, and can be us ...
s use an in-RAM piece table internally, including
Bravo Bravo(s) or The Bravo(s) may refer to: Arts and entertainment Music Groups and labels *Bravo (band), a Russian rock band * Bravo (Spanish group), represented Spain at Eurovision 1984 *Bravo Music, an American concert band music publishing company ...
,
Abiword AbiWord () is a free and open-source software word processor. It is written in C++ and since version 3 it is based on GTK+ 3. The name "AbiWord" is derived from the root of the Spanish word "'' abierto''", meaning "open".Project MascoAbi the Ant ...
,James Brown
"Piece Chains: Design & Implementation of a Win32 Text Editor"
Atom Every atom is composed of a nucleus and one or more electrons bound to the nucleus. The nucleus is made of one or more protons and a number of neutrons. Only the most common variety of hydrogen has no neutrons. Every solid, liquid, gas, and ...
and
Visual Studio Code Visual Studio Code, also commonly referred to as VS Code, is a source-code editor made by Microsoft with the Electron Framework, for Windows, Linux and macOS. Features include support for debugging, syntax highlighting, intelligent code complet ...
. The "fast save" feature in some versions of Microsoft Word uses a piece table for the on-disk file format. The on-disk representation of text files in the Oberon System uses a piece chain technique that allows pieces of one document to point to text stored in some other document, similar to
transclusion In computer science, transclusion is the inclusion of part or all of an electronic document into one or more other documents by reference via hypertext. Transclusion is usually performed when the referencing document is displayed, and is normal ...
. Niklaus Wirth, Jürg Gutknecht
"Project Oberon: The Design of an Operating System and Compiler"
. 2005. p. 90.


See also

*
Rope (computer science) In computer programming, a rope, or cord, is a data structure composed of smaller String (computer science), strings that is used to efficiently store and manipulate a very long string. For example, a text editing program may use a rope to represe ...
*
Gap buffer A gap buffer in computer science is a dynamic array that allows efficient insertion and deletion operations clustered near the same location. Gap buffers are especially common in text editors, where most changes to the text occur at or near the cur ...
, a data structure commonly used in text editors that allows efficient insertion and deletion operations clustered near the same location * Enfilade_(Xanadu), the Model-T Enfilade is a piece table with a tree-based implementation.


References

{{reflist String data structures