
In computer programming, a rope, or cord, is a data structure composed of smaller strings that is used to efficiently store and manipulate a very long string. For example, a text editing program may use a rope to represent the text being edited, so that operations such as insertion, deletion, and random access can be done efficiently.


Description

A rope is a binary tree where each leaf (end node) holds a string and a length (also known as a "weight"), and each node further up the tree holds the sum of the lengths of all the leaves in its left subtree. A node with two children thus divides the whole string into two parts: the left subtree stores the first part of the string, the right subtree stores the second part, and the node's weight is the length of the first part. For rope operations, the strings stored in nodes are assumed to be immutable objects in the typical nondestructive case, allowing for some copy-on-write behavior. Leaf nodes are usually implemented as basic fixed-length strings with a reference count attached for deallocation when no longer needed, although other garbage collection methods can be used as well.
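The node layout described above can be sketched in Java as follows. This is a minimal illustration, not a reference implementation: the class name `RopeNode`, its fields, and the example strings are assumptions made for this sketch, and reference counting is left to the JVM's garbage collector.

```java
// Minimal immutable rope node (hypothetical names, for illustration only).
final class RopeNode {
    final RopeNode left;   // null in a leaf
    final RopeNode right;  // null in a leaf
    final String text;     // non-null only in a leaf
    final int weight;      // leaf: text length; internal: total length of the left subtree

    // Leaf node: holds a string and its length.
    RopeNode(String text) {
        this.left = null;
        this.right = null;
        this.text = text;
        this.weight = text.length();
    }

    // Internal node: the weight is the length of everything under the left child.
    RopeNode(RopeNode left, RopeNode right) {
        this.left = left;
        this.right = right;
        this.text = null;
        this.weight = left.length();
    }

    boolean isLeaf() {
        return text != null;
    }

    // Total length: own weight plus the length of the right subtree.
    int length() {
        return isLeaf() ? weight : weight + right.length();
    }

    public static void main(String[] args) {
        RopeNode rope = new RopeNode(new RopeNode("Hello_"), new RopeNode("my_"));
        System.out.println(rope.weight + " " + rope.length()); // prints "6 9"
    }
}
```

Because every field is final, existing nodes can be shared freely between ropes, which is what makes the nondestructive operations cheap.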


Operations

In the following definitions, ''N'' is the length of the rope.


Collect Leaves

: ''Definition:'' Create a stack ''S'' and a list ''L''. Traverse down the left-most spine of the tree until you reach a leaf l', pushing each node ''n'' visited on the way onto ''S''. Add l' to ''L''. The parent of l' (''p'') is now at the top of the stack. Repeat the procedure for p's right subtree.
Java signature: final class InOrderRopeIterator implements Iterator
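The stack-based procedure above might look like this in Java. The `Node` type and the method name `collectLeaves` are assumptions for this sketch, and internal nodes are assumed to always have two children.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class CollectLeavesSketch {
    // Hypothetical node type: a leaf holds text, an internal node holds two children.
    static final class Node {
        final Node left, right;
        final String text;
        Node(String text) { this.left = null; this.right = null; this.text = text; }
        Node(Node left, Node right) { this.left = left; this.right = right; this.text = null; }
        boolean isLeaf() { return text != null; }
    }

    // Returns the leaf strings in left-to-right order.
    static List<String> collectLeaves(Node root) {
        Deque<Node> stack = new ArrayDeque<>();   // the stack S
        List<String> leaves = new ArrayList<>();  // the list L
        Node current = root;
        while (current != null || !stack.isEmpty()) {
            // Walk down the left-most spine, pushing each internal node.
            while (current != null && !current.isLeaf()) {
                stack.push(current);
                current = current.left;
            }
            if (current != null) {
                leaves.add(current.text);         // reached a leaf
            }
            // The parent is on top of the stack; continue in its right subtree.
            current = stack.isEmpty() ? null : stack.pop().right;
        }
        return leaves;
    }

    public static void main(String[] args) {
        Node rope = new Node(new Node(new Node("Hello_"), new Node("my_")), new Node("name"));
        System.out.println(collectLeaves(rope)); // prints [Hello_, my_, name]
    }
}
```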


Rebalance

: ''Definition:'' Collect the set of leaves ''L'' and rebuild the tree from the bottom up.
Java signatures:
static boolean isBalanced(RopeLike r)
static RopeLike rebalance(RopeLike r)
static RopeLike merge(List leaves)
static RopeLike merge(List leaves, int start, int end)
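One simple way to rebuild a balanced tree from the collected leaves is to halve the leaf list recursively, as in the sketch below. The `Node` type, the helper names, and the recursive-halving strategy are assumptions chosen for illustration; other rebalancing schemes exist.

```java
import java.util.ArrayList;
import java.util.List;

final class RebalanceSketch {
    // Hypothetical node type; the weight is recomputed by the constructors.
    static final class Node {
        final Node left, right;
        final String text;
        final int weight;
        Node(String text) { this.left = null; this.right = null; this.text = text; this.weight = text.length(); }
        Node(Node left, Node right) { this.left = left; this.right = right; this.text = null; this.weight = left.length(); }
        boolean isLeaf() { return text != null; }
        int length() { return isLeaf() ? weight : weight + right.length(); }
    }

    // Gathers the leaf nodes in left-to-right order.
    static void collectLeaves(Node node, List<Node> out) {
        if (node.isLeaf()) { out.add(node); return; }
        collectLeaves(node.left, out);
        collectLeaves(node.right, out);
    }

    // Builds a tree over leaves[start, end) by splitting the range in half.
    static Node merge(List<Node> leaves, int start, int end) {
        if (end - start == 1) return leaves.get(start);
        int mid = start + (end - start) / 2;
        return new Node(merge(leaves, start, mid), merge(leaves, mid, end));
    }

    static Node rebalance(Node root) {
        List<Node> leaves = new ArrayList<>();
        collectLeaves(root, leaves);
        return merge(leaves, 0, leaves.size());
    }

    static int depth(Node node) {
        return node.isLeaf() ? 0 : 1 + Math.max(depth(node.left), depth(node.right));
    }

    static String flatten(Node node) {
        return node.isLeaf() ? node.text : flatten(node.left) + flatten(node.right);
    }

    public static void main(String[] args) {
        // A degenerate, right-leaning rope of four leaves.
        Node chain = new Node(new Node("Hello_"),
                new Node(new Node("my_"), new Node(new Node("name"), new Node("_is"))));
        Node balanced = rebalance(chain);
        System.out.println(depth(chain) + " -> " + depth(balanced)); // prints "3 -> 2"
        System.out.println(flatten(balanced));                       // prints "Hello_my_name_is"
    }
}
```

The leaves themselves are reused unchanged; only the internal nodes are rebuilt.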


Insert

: ''Definition:'' Insert(i, S’): insert the string ''S’'' beginning at position ''i'' in the string ''s'', to form a new string C_1, …, C_i, S’, C_{i+1}, …, C_m.
: ''Time complexity:'' O(log N).
This operation can be done by a Split() and two Concat() operations. The cost is the sum of the three.
Java signature: public Rope insert(int idx, CharSequence sequence)
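A sketch of insertion as one split followed by two concatenations is given below. All names (`Node`, `split`, `concat`, `insert`, `flatten`) are assumptions for this example; `null` stands for the empty rope, positions are 0-based counts of characters preceding the insertion point, and no rebalancing is performed.

```java
final class RopeInsertSketch {
    // Hypothetical immutable node type.
    static final class Node {
        final Node left, right;
        final String text;
        final int weight;  // leaf: text length; internal: length of the left subtree
        Node(String text) { this.left = null; this.right = null; this.text = text; this.weight = text.length(); }
        Node(Node left, Node right) { this.left = left; this.right = right; this.text = null; this.weight = left.length(); }
        boolean isLeaf() { return text != null; }
        int length() { return isLeaf() ? weight : weight + right.length(); }
    }

    // Concat: a new root over both ropes; null stands for the empty rope.
    static Node concat(Node a, Node b) {
        if (a == null) return b;
        if (b == null) return a;
        return new Node(a, b);
    }

    // Split: returns {rope of the first i characters, rope of the remainder}.
    static Node[] split(Node node, int i) {
        if (node == null) return new Node[] { null, null };
        if (node.isLeaf()) {
            if (i <= 0) return new Node[] { null, node };
            if (i >= node.weight) return new Node[] { node, null };
            return new Node[] { new Node(node.text.substring(0, i)), new Node(node.text.substring(i)) };
        }
        if (i < node.weight) {
            Node[] parts = split(node.left, i);
            return new Node[] { parts[0], concat(parts[1], node.right) };
        }
        Node[] parts = split(node.right, i - node.weight);
        return new Node[] { concat(node.left, parts[0]), parts[1] };
    }

    // Insert: one split at i, then two concatenations.
    static Node insert(Node rope, int i, String s) {
        Node[] parts = split(rope, i);
        return concat(concat(parts[0], new Node(s)), parts[1]);
    }

    static String flatten(Node node) {
        if (node == null) return "";
        return node.isLeaf() ? node.text : flatten(node.left) + flatten(node.right);
    }

    public static void main(String[] args) {
        Node rope = concat(new Node("Hello_"), new Node("Simon"));
        System.out.println(flatten(insert(rope, 6, "my_name_is_"))); // prints "Hello_my_name_is_Simon"
        System.out.println(flatten(rope));                           // prints "Hello_Simon"
    }
}
```

The original rope is left untouched, so the old version remains available after the insertion.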


Index

: ''Definition:'' Index(i): return the character at position ''i''
: ''Time complexity:'' O(log N)
To retrieve the ''i''-th character, we begin a recursive search from the root node:
Java signature: @Override public int indexOf(char ch, int startIndex)
For example, to find the character at i = 10 in Figure 2.1, start at the root node (A), find that 22 is greater than 10 and there is a left child, so go to the left child (B). 9 is less than 10, so subtract 9 from 10 (leaving i = 1) and go to the right child (D). Then, because 6 is greater than 1 and there is a left child, go to the left child (G). 2 is greater than 1 and there is a left child, so go to the left child again (J). Finally, 2 is greater than 1 but there is no left child, so the character at index 1 of the short string "na" (i.e. "n") is the answer. The example uses 1-based indexing.
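The recursive search can be sketched as follows. Unlike the 1-based walkthrough above, this sketch uses 0-based indices, as is usual in Java, so the tenth character is at index 9. The `Node` type, the method name `charAt`, and the leaf strings are assumptions for illustration; the tree starts at the node with weight 9.

```java
final class RopeIndexSketch {
    // Hypothetical immutable node type.
    static final class Node {
        final Node left, right;
        final String text;
        final int weight;  // leaf: text length; internal: length of the left subtree
        Node(String text) { this.left = null; this.right = null; this.text = text; this.weight = text.length(); }
        Node(Node left, Node right) { this.left = left; this.right = right; this.text = null; this.weight = left.length(); }
        boolean isLeaf() { return text != null; }
        int length() { return isLeaf() ? weight : weight + right.length(); }
    }

    // Returns the character at 0-based index i.
    static char charAt(Node node, int i) {
        if (node.isLeaf()) {
            return node.text.charAt(i);
        }
        if (i < node.weight) {
            return charAt(node.left, i);             // target lies in the left part
        }
        return charAt(node.right, i - node.weight);  // skip the left part entirely
    }

    public static void main(String[] args) {
        Node left = new Node(new Node("Hello_"), new Node("my_"));            // weight 6
        Node right = new Node(new Node(new Node("na"), new Node("me_i")),
                              new Node(new Node("s"), new Node("_Simon")));   // weight 6
        Node rope = new Node(left, right);                                    // weight 9
        System.out.println(charAt(rope, 9)); // prints "n"
    }
}
```

Each step descends one level, so the cost is proportional to the depth of the tree.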


Concat

: ''Definition:'' Concat(S1, S2): concatenate two ropes, ''S''1 and ''S''2, into a single rope.
: ''Time complexity:'' O(1) (or O(log N) time to compute the root weight)
A concatenation can be performed simply by creating a new root node with left = ''S''1 and right = ''S''2, which is constant time. The weight of the parent node is set to the length of the left child ''S''1, which would take O(log N) time if the tree is balanced. As most rope operations require balanced trees, the tree may need to be re-balanced after concatenation.
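A minimal sketch of concatenation follows; the `Node` type and the `concat` name are assumptions for this example, and rebalancing is omitted. The only non-constant work is `left.length()`, which walks the right spine of the left operand.

```java
final class RopeConcatSketch {
    // Hypothetical immutable node type.
    static final class Node {
        final Node left, right;
        final String text;
        final int weight;  // leaf: text length; internal: length of the left subtree
        Node(String text) { this.left = null; this.right = null; this.text = text; this.weight = text.length(); }
        Node(Node left, Node right) { this.left = left; this.right = right; this.text = null; this.weight = left.length(); }
        boolean isLeaf() { return text != null; }
        // Walks only the right spine, so the cost is bounded by the tree depth.
        int length() { return isLeaf() ? weight : weight + right.length(); }
    }

    // Concat: create a new root; both operands are shared, not copied.
    static Node concat(Node s1, Node s2) {
        return new Node(s1, s2);
    }

    public static void main(String[] args) {
        Node s1 = concat(new Node("Hello_"), new Node("my_"));
        Node s2 = concat(new Node("name"), new Node("_is"));
        Node rope = concat(s1, s2);
        // Root weight is the length of s1; total length adds the length of s2.
        System.out.println(rope.weight + " " + rope.length() + " " + (rope.left == s1)); // prints "9 16 true"
    }
}
```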


Split

: ''Definition:'' Split(i, S): split the string ''S'' into two new strings ''S''1 and ''S''2, S1 = C_1, …, C_i and S2 = C_{i+1}, …, C_m.
: ''Time complexity:'' O(log N)
There are two cases that must be dealt with:
# The split point is at the end of a string (i.e. after the last character of a leaf node)
# The split point is in the middle of a string.
The second case reduces to the first by splitting the string at the split point to create two new leaf nodes, then creating a new node that is the parent of the two component strings.
For example, to split the 22-character rope pictured in Figure 2.3 into two equal component ropes of length 11, query the 12th character to locate the node ''K'' at the bottom level. Remove the link between ''K'' and ''G''. Go to the parent of ''G'' and subtract the weight of ''K'' from the weight of ''D''. Travel up the tree and remove any right links to subtrees covering characters past position 11, subtracting the weight of ''K'' from their parent nodes (only nodes ''D'' and ''A'', in this case). Finally, build up the newly orphaned nodes ''K'' and ''H'' by concatenating them together and creating a new parent ''P'' with weight equal to the length of the left node ''K''. As most rope operations require balanced trees, the tree may need to be re-balanced after splitting.
Java signature: public Pair split(int index)
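The sketch below shows a nondestructive, recursive variant of the split. Instead of cutting links and adjusting weights in place as in the walkthrough above, it builds new parent nodes on the way back up and shares untouched subtrees. The names (`Node`, `split`, `concat`, `flatten`) and the leaf strings are assumptions; `null` stands for the empty rope.

```java
final class RopeSplitSketch {
    // Hypothetical immutable node type.
    static final class Node {
        final Node left, right;
        final String text;
        final int weight;  // leaf: text length; internal: length of the left subtree
        Node(String text) { this.left = null; this.right = null; this.text = text; this.weight = text.length(); }
        Node(Node left, Node right) { this.left = left; this.right = right; this.text = null; this.weight = left.length(); }
        boolean isLeaf() { return text != null; }
        int length() { return isLeaf() ? weight : weight + right.length(); }
    }

    static Node concat(Node a, Node b) {
        if (a == null) return b;
        if (b == null) return a;
        return new Node(a, b);
    }

    // Returns {rope of the first i characters, rope of the remainder}.
    static Node[] split(Node node, int i) {
        if (node == null) return new Node[] { null, null };
        if (node.isLeaf()) {
            // Case 1: the split point falls on a leaf boundary.
            if (i <= 0) return new Node[] { null, node };
            if (i >= node.weight) return new Node[] { node, null };
            // Case 2: the split point is inside the leaf, so cut it into two leaves.
            return new Node[] { new Node(node.text.substring(0, i)), new Node(node.text.substring(i)) };
        }
        if (i < node.weight) {
            Node[] parts = split(node.left, i);
            return new Node[] { parts[0], concat(parts[1], node.right) };
        }
        Node[] parts = split(node.right, i - node.weight);
        return new Node[] { concat(node.left, parts[0]), parts[1] };
    }

    static String flatten(Node node) {
        if (node == null) return "";
        return node.isLeaf() ? node.text : flatten(node.left) + flatten(node.right);
    }

    public static void main(String[] args) {
        Node rope = new Node(
                new Node(new Node("Hello_"), new Node("my_")),
                new Node(new Node(new Node("na"), new Node("me_i")),
                         new Node(new Node("s"), new Node("_Simon"))));
        Node[] halves = split(rope, 11);
        System.out.println(flatten(halves[0])); // prints "Hello_my_na"
        System.out.println(flatten(halves[1])); // prints "me_is_Simon"
    }
}
```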


Delete

: ''Definition:'' Delete(i, j): delete the substring C_i, …, C_{i+j-1} from ''s'' to form a new string C_1, …, C_{i-1}, C_{i+j}, …, C_m.
: ''Time complexity:'' O(log N).
This operation can be done by two Split() and one Concat() operation. First, split the rope in three, divided by the ''i''-th and (''i''+''j'')-th characters respectively, which extracts the string to delete in a separate node. Then concatenate the other two nodes.
Java signature: @Override public RopeLike delete(int start, int length)
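Deletion as two splits and one concatenation might be sketched like this. The names are assumptions for this example, `start` is a 0-based offset, `null` stands for the empty rope, and rebalancing is omitted.

```java
final class RopeDeleteSketch {
    // Hypothetical immutable node type.
    static final class Node {
        final Node left, right;
        final String text;
        final int weight;  // leaf: text length; internal: length of the left subtree
        Node(String text) { this.left = null; this.right = null; this.text = text; this.weight = text.length(); }
        Node(Node left, Node right) { this.left = left; this.right = right; this.text = null; this.weight = left.length(); }
        boolean isLeaf() { return text != null; }
        int length() { return isLeaf() ? weight : weight + right.length(); }
    }

    static Node concat(Node a, Node b) {
        if (a == null) return b;
        if (b == null) return a;
        return new Node(a, b);
    }

    // Returns {rope of the first i characters, rope of the remainder}.
    static Node[] split(Node node, int i) {
        if (node == null) return new Node[] { null, null };
        if (node.isLeaf()) {
            if (i <= 0) return new Node[] { null, node };
            if (i >= node.weight) return new Node[] { node, null };
            return new Node[] { new Node(node.text.substring(0, i)), new Node(node.text.substring(i)) };
        }
        if (i < node.weight) {
            Node[] parts = split(node.left, i);
            return new Node[] { parts[0], concat(parts[1], node.right) };
        }
        Node[] parts = split(node.right, i - node.weight);
        return new Node[] { concat(node.left, parts[0]), parts[1] };
    }

    // Delete `length` characters starting at 0-based offset `start`.
    static Node delete(Node rope, int start, int length) {
        Node[] head = split(rope, start);      // head[0] = prefix to keep
        Node[] tail = split(head[1], length);  // tail[0] = removed part, tail[1] = suffix to keep
        return concat(head[0], tail[1]);
    }

    static String flatten(Node node) {
        if (node == null) return "";
        return node.isLeaf() ? node.text : flatten(node.left) + flatten(node.right);
    }

    public static void main(String[] args) {
        Node rope = concat(concat(new Node("Hello_"), new Node("my_")),
                           concat(new Node("name_is"), new Node("_Simon")));
        System.out.println(flatten(delete(rope, 5, 8))); // prints "Hello_is_Simon"
    }
}
```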


Report

: ''Definition:'' Report(i, j): output the string C_i, …, C_{i+j-1}.
: ''Time complexity:'' O(j + log N)
To report the string C_i, …, C_{i+j-1}, find the node ''u'' that contains C_i and has weight(u) ≥ j, and then traverse ''T'' starting at node ''u''. Output C_i, …, C_{i+j-1} by doing an in-order traversal of ''T'' starting at node ''u''.
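The sketch below is a recursive variant of this idea: rather than first locating a node ''u'', it descends from the root and performs an in-order traversal that skips every subtree lying entirely outside the requested range. The `Node` type, the `report` name, and the 0-based `start` offset are assumptions for illustration.

```java
final class RopeReportSketch {
    // Hypothetical immutable node type.
    static final class Node {
        final Node left, right;
        final String text;
        final int weight;  // leaf: text length; internal: length of the left subtree
        Node(String text) { this.left = null; this.right = null; this.text = text; this.weight = text.length(); }
        Node(Node left, Node right) { this.left = left; this.right = right; this.text = null; this.weight = left.length(); }
        boolean isLeaf() { return text != null; }
        int length() { return isLeaf() ? weight : weight + right.length(); }
    }

    // Appends `len` characters starting at 0-based offset `start` to `out`.
    static void report(Node node, int start, int len, StringBuilder out) {
        if (len <= 0) return;
        if (node.isLeaf()) {
            out.append(node.text, start, Math.min(node.text.length(), start + len));
            return;
        }
        if (start < node.weight) {
            // The range begins in the left subtree and may spill into the right one.
            int fromLeft = Math.min(len, node.weight - start);
            report(node.left, start, fromLeft, out);
            report(node.right, 0, len - fromLeft, out);
        } else {
            // The range lies entirely in the right subtree.
            report(node.right, start - node.weight, len, out);
        }
    }

    static String report(Node rope, int start, int len) {
        StringBuilder out = new StringBuilder();
        report(rope, start, len, out);
        return out.toString();
    }

    public static void main(String[] args) {
        Node rope = new Node(
                new Node(new Node("Hello_"), new Node("my_")),
                new Node(new Node(new Node("na"), new Node("me_i")),
                         new Node(new Node("s"), new Node("_Simon"))));
        System.out.println(report(rope, 6, 7)); // prints "my_name"
    }
}
```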


Comparison with monolithic arrays

Advantages:
* Ropes enable much faster insertion and deletion of text than monolithic string arrays, on which operations have time complexity O(n).
* Ropes don't require O(n) extra memory when operated upon (arrays need that for copying operations).
* Ropes don't require large contiguous memory spaces.
* If only nondestructive versions of operations are used, a rope is a persistent data structure. For the text editing program example, this leads to easy support for multiple undo levels.

Disadvantages:
* Greater overall space use when not being operated on, mainly to store parent nodes. There is a trade-off between how much of the total memory is such overhead and how long pieces of data are being processed as strings. The strings in the example figures above are unrealistically short for modern architectures. The overhead is always O(n), but the constant can be made arbitrarily small.
* Increase in time to manage the extra storage
* Increased complexity of source code; greater risk of bugs

This table compares the ''algorithmic'' traits of string and rope implementations, not their ''raw speed''. Array-based strings have smaller overhead, so (for example) concatenation and split operations are faster on small datasets. However, when array-based strings are used for longer strings, the time complexity and memory use for inserting and deleting characters become unacceptably large. In contrast, a rope data structure has stable performance regardless of data size. Further, the space complexity is O(n) for both ropes and arrays. In summary, ropes are preferable when the data is large and modified often.


See also

* The Cedar programming environment, which used ropes "almost since its inception"
* The Model T enfilade, a similar data structure from the early 1970s.
* Gap buffer, a data structure commonly used in text editors that allows efficient insertion and deletion operations clustered near the same location
* Piece table, another data structure commonly used in text editors




External links

* "absl::Cord" implementation of ropes within The Abseil library
* "C cords" implementation of ropes within the Boehm Garbage Collector library
* (supported by STLPort an
* Ropes for C#
* ropes for Common Lisp
* Ropes for Java
* String-Like Ropes for Java
* Ropes for JavaScript
* Ropes for Limbo
* ropes for Nim
* Ropes for OCaml
* pyropes for Python
* Ropes for Smalltalk
* SwiftRope for Swift
* "Ropey" for Rust