Data Abstraction
   HOME

TheInfoList



OR:

In
software engineering Software engineering is a systematic engineering approach to software development. A software engineer is a person who applies the principles of software engineering to design, develop, maintain, test, and evaluate computer software. The term '' ...
and
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, abstraction is: * The process of removing or generalizing physical, spatial, or temporal details or
attributes Attribute may refer to: * Attribute (philosophy), an extrinsic property of an object * Attribute (research), a characteristic of an object * Grammatical modifier, in natural languages * Attribute (computing), a specification that defines a prope ...
in the study of objects or systems to focus attention on details of greater importance; it is similar in nature to the process of
generalization A generalization is a form of abstraction whereby common properties of specific instances are formulated as general concepts or claims. Generalizations posit the existence of a domain or set of elements, as well as one or more common characte ...
; * the creation of abstract
concept Concepts are defined as abstract ideas. They are understood to be the fundamental building blocks of the concept behind principles, thoughts and beliefs. They play an important role in all aspects of cognition. As such, concepts are studied by ...
-
objects Object may refer to: General meanings * Object (philosophy), a thing, being, or concept ** Object (abstract), an object which does not exist at any particular time or place ** Physical object, an identifiable collection of matter * Goal, an ...
by mirroring common features or attributes of various non-abstract objects or systems of study – the result of the process of abstraction. Abstraction, in general, is a fundamental concept in computer science and
software development Software development is the process of conceiving, specifying, designing, programming, documenting, testing, and bug fixing involved in creating and maintaining applications, frameworks, or other software components. Software development invol ...
. The process of abstraction can also be referred to as modeling and is closely related to the concepts of ''
theory A theory is a rational type of abstract thinking about a phenomenon, or the results of such thinking. The process of contemplative and rational thinking is often associated with such processes as observational study or research. Theories may be ...
'' and ''
design A design is a plan or specification for the construction of an object or system or for the implementation of an activity or process or the result of that plan or specification in the form of a prototype, product, or process. The verb ''to design'' ...
''.
Models A model is an informative representation of an object, person or system. The term originally denoted the plans of a building in late 16th-century English, and derived via French and Italian ultimately from Latin ''modulus'', a measure. Models c ...
can also be considered types of abstractions per their generalization of aspects of
reality Reality is the sum or aggregate of all that is real or existent within a system, as opposed to that which is only imaginary. The term is also used to refer to the ontological status of things, indicating their existence. In physical terms, r ...
. Abstraction in computer science is closely related to abstraction in mathematics due to their common focus on building abstractions as objects, but is also related to other notions of abstraction used in other fields such as art. Abstractions may also refer to real-world objects and systems, rules of computational systems or rules of
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s that carry or utilize features of abstraction itself, such as: * the usage of data types to perform ''data abstraction'' to separate usage from working representations of
data structure In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, a ...
s within programs; * the concept of procedures, functions, or subroutines which represent a specific of implementing
control flow In computer science, control flow (or flow of control) is the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated. The emphasis on explicit control flow distinguishes an ''im ...
in programs; * the rules commonly named "abstraction" that generalize expressions using
free and bound variables In mathematics, and in other disciplines involving formal languages, including mathematical logic and computer science, a free variable is a notation (symbol) that specifies places in an expression where substitution may take place and is not ...
in the various versions of
lambda calculus Lambda calculus (also written as ''λ''-calculus) is a formal system in mathematical logic for expressing computation based on function abstraction and application using variable binding and substitution. It is a universal model of computation ...
; * the usage of
S-expression In computer programming, an S-expression (or symbolic expression, abbreviated as sexpr or sexp) is an expression in a like-named notation for nested list (tree-structured) data. S-expressions were invented for and popularized by the programming la ...
s as an abstraction of data structures and programs in the
Lisp programming language Lisp (historically LISP) is a family of programming languages with a long history and a distinctive, fully parenthesized prefix notation. Originally specified in 1960, Lisp is the second-oldest high-level programming language still in common u ...
; * the process of reorganizing common behavior from non-abstract classes into "abstract classes" using
inheritance Inheritance is the practice of receiving private property, Title (property), titles, debts, entitlements, Privilege (law), privileges, rights, and Law of obligations, obligations upon the death of an individual. The rules of inheritance differ ...
to abstract over sub-classes as seen in the
object-oriented Object-oriented programming (OOP) is a programming paradigm based on the concept of " objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of p ...
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
and
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's List ...
programming languages.


Rationale

Computing mostly operates independently of the concrete world. The hardware implements a
model of computation In computer science, and more specifically in computability theory and computational complexity theory, a model of computation is a model which describes how an output of a mathematical function is computed given an input. A model describes how ...
that is interchangeable with others. The software is structured in
architecture Architecture is the art and technique of designing and building, as distinguished from the skills associated with construction. It is both the process and the product of sketching, conceiving, planning, designing, and constructing building ...
s to enable humans to create the enormous systems by concentrating on a few issues at a time. These architectures are made of specific choices of abstractions. Greenspun's Tenth Rule is an
aphorism An aphorism (from Greek ἀφορισμός: ''aphorismos'', denoting 'delimitation', 'distinction', and 'definition') is a concise, terse, laconic, or memorable expression of a general truth or principle. Aphorisms are often handed down by ...
on how such an architecture is both inevitable and complex. A central form of abstraction in computing is language abstraction: new artificial languages are developed to express specific aspects of a system. ''
Modeling languages A modeling language is any artificial language that can be used to express information or knowledge or systems in a structure that is defined by a consistent set of rules. The rules are used for interpretation of the meaning of components in the st ...
'' help in planning. ''
Computer language A computer language is a formal language used to communicate with a computer. Types of computer languages include: * Construction language – all forms of communication by which a human can specify an executable problem solution to a comput ...
s'' can be processed with a computer. An example of this abstraction process is the generational development of
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s from the
machine language In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a very ...
to the
assembly language In computer programming, assembly language (or assembler language, or symbolic machine code), often referred to simply as Assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence be ...
and the
high-level language In computer science, a high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be easier to use, ...
. Each stage can be used as a stepping stone for the next stage. The language abstraction continues for example in
scripting language A scripting language or script language is a programming language that is used to manipulate, customize, and automate the facilities of an existing system. Scripting languages are usually interpreted at runtime rather than compiled. A scripting ...
s and
domain-specific programming language A domain-specific language (DSL) is a computer language specialized to a particular application domain. This is in contrast to a general-purpose language (GPL), which is broadly applicable across domains. There are a wide variety of DSLs, ranging ...
s. Within a programming language, some features let the programmer create new abstractions. These include
subroutine In computer programming, a function or subroutine is a sequence of program instructions that performs a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed. Functions may ...
s,
modules Broadly speaking, modularity is the degree to which a system's components may be separated and recombined, often with the benefit of flexibility and variety in use. The concept of modularity is used primarily to reduce complexity by breaking a s ...
, polymorphism, and
software component Component-based software engineering (CBSE), also called component-based development (CBD), is a branch of software engineering that emphasizes the separation of concerns with respect to the wide-ranging functionality available throughout a give ...
s. Some other abstractions such as
software design pattern In software engineering, a software design pattern is a general, reusable solution to a commonly occurring problem within a given context in software design. It is not a finished design that can be transformed directly into source or machine co ...
s and
architectural styles An architectural style is a set of characteristics and features that make a building or other structure notable or historically identifiable. It is a sub-class of style in the visual arts generally, and most styles in architecture relate closely ...
remain invisible to a
translator Translation is the communication of the Meaning (linguistic), meaning of a #Source and target languages, source-language text by means of an Dynamic and formal equivalence, equivalent #Source and target languages, target-language text. The ...
and operate only in the design of a system. Some abstractions try to limit the range of concepts a programmer needs to be aware of, by completely hiding the abstractions that they in turn are built on. The software engineer and writer
Joel Spolsky Avram Joel Spolsky (born 1965) is a software engineer and writer. He is the author of ''Joel on Software'', a blog on software development, and the creator of the project management software Trello. He was a Program Manager on the Microsoft Exce ...
has criticised these efforts by claiming that all abstractions are '' leaky'' – that they can never completely hide the details below; however, this does not negate the usefulness of abstraction. Some abstractions are designed to inter-operate with other abstractions – for example, a programming language may contain a
foreign function interface A foreign function interface (FFI) is a mechanism by which a program written in one programming language can call routines or make use of services written in another. Naming The term comes from the specification for Common Lisp, which explicit ...
for making calls to the lower-level language.


Abstraction features


Programming languages

Different programming languages provide different types of abstraction, depending on the intended applications for the language. For example: * In
object-oriented programming language Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of ...
s such as
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
,
Object Pascal Object Pascal is an extension to the programming language Pascal (programming language), Pascal that provides object-oriented programming (OOP) features such as Class (computer programming), classes and Method (computer programming), methods. ...
, or
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's List ...
, the concept of abstraction has itself become a declarative statement – using the
syntax In linguistics, syntax () is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure ( constituency) ...
''function''(''parameters'') = 0; (in
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
) or the
keyword Keyword may refer to: Computing * Keyword (Internet search), a word or phrase typically used by bloggers or online content creator to rank a web page on a particular topic * Index term, a term used as a keyword to documents in an information syst ...
s ''abstract'' and ''interface'' (in
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's List ...
). After such a declaration, it is the responsibility of the programmer to implement a
class Class or The Class may refer to: Common uses not otherwise categorized * Class (biology), a taxonomic rank * Class (knowledge representation), a collection of individuals or objects * Class (philosophy), an analytical concept used differentl ...
to instantiate the
object Object may refer to: General meanings * Object (philosophy), a thing, being, or concept ** Object (abstract), an object which does not exist at any particular time or place ** Physical object, an identifiable collection of matter * Goal, an ...
of the declaration. *
Functional programming language In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that ...
s commonly exhibit abstractions related to functions, such as
lambda abstraction Lambda calculus (also written as ''λ''-calculus) is a formal system in mathematical logic for expressing computation based on function abstraction and application using variable binding and substitution. It is a universal model of computation tha ...
s (making a term into a function of some variable) and higher-order functions (parameters are functions). * Modern members of the Lisp programming language family such as
Clojure Clojure (, like ''closure'') is a dynamic and functional dialect of the Lisp programming language on the Java platform. Like other Lisp dialects, Clojure treats code as data and has a Lisp macro system. The current development process is comm ...
,
Scheme A scheme is a systematic plan for the implementation of a certain idea. Scheme or schemer may refer to: Arts and entertainment * ''The Scheme'' (TV series), a BBC Scotland documentary series * The Scheme (band), an English pop band * ''The Schem ...
and
Common Lisp Common Lisp (CL) is a dialect of the Lisp programming language, published in ANSI standard document ''ANSI INCITS 226-1994 (S20018)'' (formerly ''X3.226-1994 (R1999)''). The Common Lisp HyperSpec, a hyperlinked HTML version, has been derived fro ...
support macro systems to allow syntactic abstraction. Other programming languages such as Scala also have macros, or very similar
metaprogramming Metaprogramming is a programming technique in which computer programs have the ability to treat other programs as their data. It means that a program can be designed to read, generate, analyze or transform other programs, and even modify itself ...
features (for example,
Haskell Haskell () is a general-purpose, statically-typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research and industrial applications, Haskell has pioneered a number of programming lan ...
has Template Haskell, and
OCaml OCaml ( , formerly Objective Caml) is a general-purpose programming language, general-purpose, multi-paradigm programming language which extends the Caml dialect of ML (programming language), ML with object-oriented programming, object-oriented ...
has MetaOCaml). These can allow a programmer to eliminate boilerplate code, abstract away tedious function call sequences, implement new control flow structures, and implement Domain Specific Languages (DSLs), which allow domain-specific concepts to be expressed in concise and elegant ways. All of these, when used correctly, improve both the programmer's efficiency and the clarity of the code by making the intended purpose more explicit. A consequence of syntactic abstraction is also that any Lisp dialect and in fact almost any programming language can, in principle, be implemented in any modern Lisp with significantly reduced (but still non-trivial in most cases) effort when compared to "more traditional" programming languages such as
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
, C or
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's List ...
.


Specification methods

Analysts have developed various methods to formally specify software systems. Some known methods include: * Abstract-model based method (VDM, Z); * Algebraic techniques (Larch, CLEAR, OBJ, ACT ONE, CASL); * Process-based techniques (LOTOS, SDL, Estelle); * Trace-based techniques (SPECIAL, TAM); * Knowledge-based techniques (Refine, Gist).


Specification languages

Specification languages generally rely on abstractions of one kind or another, since specifications are typically defined earlier in a project, (and at a more abstract level) than an eventual implementation. The
UML The Unified Modeling Language (UML) is a general-purpose, developmental modeling language in the field of software engineering that is intended to provide a standard way to visualize the design of a system. The creation of UML was originally m ...
specification language, for example, allows the definition of ''abstract'' classes, which in a waterfall project, remain abstract during the architecture and specification phase of the project.


Control abstraction

Programming languages offer control abstraction as one of the main purposes of their use. Computer machines understand operations at the very low level such as moving some bits from one location of the memory to another location and producing the sum of two sequences of bits. Programming languages allow this to be done in the higher level. For example, consider this statement written in a Pascal-like fashion: :a := (1 + 2) * 5 To a human, this seems a fairly simple and obvious calculation (''"one plus two is three, times five is fifteen"''). However, the low-level steps necessary to carry out this evaluation, and return the value "15", and then assign that value to the variable "a", are actually quite subtle and complex. The values need to be converted to binary representation (often a much more complicated task than one would think) and the calculations decomposed (by the compiler or interpreter) into assembly instructions (again, which are much less intuitive to the programmer: operations such as shifting a binary register left, or adding the binary complement of the contents of one register to another, are simply not how humans think about the abstract arithmetical operations of addition or multiplication). Finally, assigning the resulting value of "15" to the variable labeled "a", so that "a" can be used later, involves additional 'behind-the-scenes' steps of looking up a variable's label and the resultant location in physical or virtual memory, storing the binary representation of "15" to that memory location, etc. Without control abstraction, a programmer would need to specify ''all'' the register/binary-level steps each time they simply wanted to add or multiply a couple of numbers and assign the result to a variable. Such duplication of effort has two serious negative consequences: # it forces the programmer to constantly repeat fairly common tasks every time a similar operation is needed # it forces the programmer to program for the particular hardware and instruction set


Structured programming

Structured programming involves the splitting of complex program tasks into smaller pieces with clear flow-control and interfaces between components, with a reduction of the complexity potential for side-effects. In a simple program, this may aim to ensure that loops have single or obvious exit points and (where possible) to have single exit points from functions and procedures. In a larger system, it may involve breaking down complex tasks into many different modules. Consider a system which handles payroll on ships and at shore offices: * The uppermost level may feature a menu of typical end-user operations. * Within that could be standalone executables or libraries for tasks such as signing on and off employees or printing checks. * Within each of those standalone components there could be many different source files, each containing the program code to handle a part of the problem, with only selected interfaces available to other parts of the program. A sign on program could have source files for each data entry screen and the database interface (which may itself be a standalone third party library or a statically linked set of library routines). *Either the database or the payroll application also has to initiate the process of exchanging data with between ship and shore, and that data transfer task will often contain many other components. These layers produce the effect of isolating the implementation details of one component and its assorted internal methods from the others. Object-oriented programming embraces and extends this concept.


Data abstraction

Data abstraction enforces a clear separation between the ''abstract'' properties of a data type and the ''concrete'' details of its implementation. The abstract properties are those that are visible to client code that makes use of the data type—the ''interface'' to the data type—while the concrete implementation is kept entirely private, and indeed can change, for example to incorporate efficiency improvements over time. The idea is that such changes are not supposed to have any impact on client code, since they involve no difference in the abstract behaviour. For example, one could define an
abstract data type In computer science, an abstract data type (ADT) is a mathematical model for data types. An abstract data type is defined by its behavior (semantics) from the point of view of a ''user'', of the data, specifically in terms of possible values, pos ...
called ''lookup table'' which uniquely associates ''keys'' with ''values'', and in which values may be retrieved by specifying their corresponding keys. Such a lookup table may be implemented in various ways: as a
hash table In computing, a hash table, also known as hash map, is a data structure that implements an associative array or dictionary. It is an abstract data type that maps keys to values. A hash table uses a hash function to compute an ''index'', ...
, a
binary search tree In computer science, a binary search tree (BST), also called an ordered or sorted binary tree, is a rooted binary tree data structure with the key of each internal node being greater than all the keys in the respective node's left subtree and ...
, or even a simple linear
list A ''list'' is any set of items in a row. List or lists may also refer to: People * List (surname) Organizations * List College, an undergraduate division of the Jewish Theological Seminary of America * SC Germania List, German rugby union ...
of (key:value) pairs. As far as client code is concerned, the abstract properties of the type are the same in each case. Of course, this all relies on getting the details of the interface right in the first place, since any changes there can have major impacts on client code. As one way to look at this: the interface forms a ''contract'' on agreed behaviour between the data type and client code; anything not spelled out in the contract is subject to change without notice.


Manual data abstraction

While much of data abstraction occurs through computer science and automation, there are times when this process is done manually and without programming intervention. One way this can be understood is through data abstraction within the process of conducting a systematic review of the literature. In this methodology, data is abstracted by one or several abstractors when conducting a
meta-analysis A meta-analysis is a statistical analysis that combines the results of multiple scientific studies. Meta-analyses can be performed when there are multiple scientific studies addressing the same question, with each individual study reporting me ...
, with errors reduced through dual data abstraction followed by independent checking, known as
adjudication Adjudication is the legal process by which an arbiter or judge reviews evidence and argumentation, including legal reasoning set forth by opposing parties or litigants, to come to a decision which determines rights and obligations between the p ...
.


Abstraction in object oriented programming

In
object-oriented programming Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of pr ...
theory, abstraction involves the facility to define objects that represent abstract "actors" that can perform work, report on and change their state, and "communicate" with other objects in the system. The term encapsulation refers to the hiding of
state State may refer to: Arts, entertainment, and media Literature * ''State Magazine'', a monthly magazine published by the U.S. Department of State * ''The State'' (newspaper), a daily newspaper in Columbia, South Carolina, United States * ''Our S ...
details, but extending the concept of ''data type'' from earlier programming languages to associate ''behavior'' most strongly with the data, and standardizing the way that different data types interact, is the beginning of abstraction. When abstraction proceeds into the operations defined, enabling objects of different types to be substituted, it is called polymorphism. When it proceeds in the opposite direction, inside the types or classes, structuring them to simplify a complex set of relationships, it is called delegation or
inheritance Inheritance is the practice of receiving private property, Title (property), titles, debts, entitlements, Privilege (law), privileges, rights, and Law of obligations, obligations upon the death of an individual. The rules of inheritance differ ...
. Various object-oriented programming languages offer similar facilities for abstraction, all to support a general strategy of polymorphism in object-oriented programming, which includes the substitution of one type for another in the same or similar role. Although not as generally supported, a configuration or image or package may predetermine a great many of these bindings at
compile-time In computer science, compile time (or compile-time) describes the time window during which a computer program is compiled. The term is used as an adjective to describe concepts related to the context of program compilation, as opposed to concept ...
, link-time, or loadtime. This would leave only a minimum of such bindings to change at run-time.
Common Lisp Object System The Common Lisp Object System (CLOS) is the facility for object-oriented programming which is part of ANSI Common Lisp. CLOS is a powerful dynamic object system which differs radically from the OOP facilities found in more static languages such ...
or
Self The self is an individual as the object of that individual’s own reflective consciousness. Since the ''self'' is a reference by a subject to the same subject, this reference is necessarily subjective. The sense of having a self—or ''selfhoo ...
, for example, feature less of a class-instance distinction and more use of delegation for polymorphism. Individual objects and functions are abstracted more flexibly to better fit with a shared functional heritage from Lisp. C++ exemplifies another extreme: it relies heavily on
templates Template may refer to: Tools * Die (manufacturing), used to cut or shape material * Mold, in a molding process * Stencil, a pattern or overlay used in graphic arts (drawing, painting, etc.) and sewing to replicate letters, shapes or designs Co ...
and overloading and other static bindings at compile-time, which in turn has certain flexibility problems. Although these examples offer alternate strategies for achieving the same abstraction, they do not fundamentally alter the need to support abstract nouns in code – all programming relies on an ability to abstract verbs as functions, nouns as data structures, and either as processes. Consider for example a sample
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's List ...
fragment to represent some common farm "animals" to a level of abstraction suitable to model simple aspects of their hunger and feeding. It defines an Animal class to represent both the state of the animal and its functions: public class Animal extends LivingThing With the above definition, one could create objects of type and call their methods like this: thePig = new Animal(); theCow = new Animal(); if (thePig.isHungry()) if (theCow.isHungry()) theCow.moveTo(theBarn); In the above example, the class ''Animal'' is an abstraction used in place of an actual animal, ''LivingThing'' is a further abstraction (in this case a generalisation) of ''Animal''. If one requires a more differentiated hierarchy of animals – to differentiate, say, those who provide milk from those who provide nothing except meat at the end of their lives – that is an intermediary level of abstraction, probably DairyAnimal (cows, goats) who would eat foods suitable to giving good milk, and MeatAnimal (pigs, steers) who would eat foods to give the best meat-quality. Such an abstraction could remove the need for the application coder to specify the type of food, so they could concentrate instead on the feeding schedule. The two classes could be related using
inheritance Inheritance is the practice of receiving private property, Title (property), titles, debts, entitlements, Privilege (law), privileges, rights, and Law of obligations, obligations upon the death of an individual. The rules of inheritance differ ...
or stand alone, and the programmer could define varying degrees of polymorphism between the two types. These facilities tend to vary drastically between languages, but in general each can achieve anything that is possible with any of the others. A great many operation overloads, data type by data type, can have the same effect at compile-time as any degree of inheritance or other means to achieve polymorphism. The class notation is simply a coder's convenience.


Object-oriented design

Decisions regarding what to abstract and what to keep under the control of the coder become the major concern of object-oriented design and
domain analysis In software engineering, domain analysis, or product line analysis, is the process of analyzing related software systems in a domain to find their common and variable parts. It is a model of wider business context for the system. The term was coine ...
—actually determining the relevant relationships in the real world is the concern of object-oriented analysis or legacy analysis. In general, to determine appropriate abstraction, one must make many small decisions about scope (domain analysis), determine what other systems one must cooperate with (legacy analysis), then perform a detailed object-oriented analysis which is expressed within project time and budget constraints as an object-oriented design. In our simple example, the domain is the barnyard, the live pigs and cows and their eating habits are the legacy constraints, the detailed analysis is that coders must have the flexibility to feed the animals what is available and thus there is no reason to code the type of food into the class itself, and the design is a single simple Animal class of which pigs and cows are instances with the same functions. A decision to differentiate DairyAnimal would change the detailed analysis but the domain and legacy analysis would be unchanged—thus it is entirely under the control of the programmer, and it is called an abstraction in object-oriented programming as distinct from abstraction in domain or legacy analysis.


Considerations

When discussing
formal semantics of programming languages In programming language theory, semantics is the rigorous mathematical study of the meaning of programming languages. Semantics assigns computational meaning to valid strings in a programming language syntax. Semantics describes the processes ...
,
formal methods In computer science, formal methods are mathematically rigorous techniques for the specification, development, and verification of software and hardware systems. The use of formal methods for software and hardware design is motivated by the exp ...
or abstract interpretation, abstraction refers to the act of considering a less detailed, but safe, definition of the observed program behaviors. For instance, one may observe only the final result of program executions instead of considering all the intermediate steps of executions. Abstraction is defined to a concrete (more precise) model of execution. Abstraction may be exact or faithful with respect to a property if one can answer a question about the property equally well on the concrete or abstract model. For instance, if one wishes to know what the result of the evaluation of a mathematical expression involving only integers +, -, ×, is worth modulo ''n'', then one needs only perform all operations modulo ''n'' (a familiar form of this abstraction is
casting out nines Casting out nines is any of three arithmetical procedures: *Adding the decimal digits of a positive whole number, while optionally ignoring any 9s or digits which sum to a multiple of 9. The result of this procedure is a number which is smaller th ...
). Abstractions, however, though not necessarily exact, should be sound. That is, it should be possible to get sound answers from them—even though the abstraction may simply yield a result of undecidability. For instance, students in a class may be abstracted by their minimal and maximal ages; if one asks whether a certain person belongs to that class, one may simply compare that person's age with the minimal and maximal ages; if his age lies outside the range, one may safely answer that the person does not belong to the class; if it does not, one may only answer "I don't know". The level of abstraction included in a programming language can influence its overall usability. The
Cognitive dimensions Cognitive dimensions or cognitive dimensions of notations are design principles for notations, user interfaces and programming languages, described by researcher Thomas R.G. Green and further researched with Marian Petre. The dimensions can be ...
framework includes the concept of ''abstraction gradient'' in a formalism. This framework allows the designer of a programming language to study the trade-offs between abstraction and other characteristics of the design, and how changes in abstraction influence the language usability. Abstractions can prove useful when dealing with computer programs, because non-trivial properties of computer programs are essentially undecidable (see
Rice's theorem In computability theory, Rice's theorem states that all non-trivial semantic properties of programs are undecidable. A semantic property is one about the program's behavior (for instance, does the program terminate for all inputs), unlike a synta ...
). As a consequence, automatic methods for deriving information on the behavior of computer programs either have to drop termination (on some occasions, they may fail, crash or never yield out a result), soundness (they may provide false information), or precision (they may answer "I don't know" to some questions). Abstraction is the core concept of abstract interpretation. Model checking generally takes place on abstract versions of the studied systems.


Levels of abstraction

Computer science commonly presents ''levels'' (or, less commonly, ''layers'') of abstraction, wherein each level represents a different model of the same information and processes, but with varying amounts of detail. Each level uses a system of expression involving a unique set of objects and compositions that apply only to a particular domain.
Luciano Floridi Luciano Floridi (; born 16 November 1964) is an Italian and British philosopher. He holds a double appointment as professor of philosophy and ethics of information at the University of Oxford, Oxford Internet Institute where is also Governin ...

''Levellism and the Method of Abstraction''
IEG – Research Report 22.11.04
Each relatively abstract, "higher" level builds on a relatively concrete, "lower" level, which tends to provide an increasingly "granular" representation. For example, gates build on electronic circuits, binary on gates, machine language on binary, programming language on machine language, applications and operating systems on programming languages. Each level is embodied, but not determined, by the level beneath it, making it a language of description that is somewhat self-contained.


Database systems

Since many users of database systems lack in-depth familiarity with computer data-structures, database developers often hide complexity through the following levels: Physical level: The lowest level of abstraction describes ''how'' a system actually stores data. The physical level describes complex low-level data structures in detail. Logical level: The next higher level of abstraction describes ''what'' data the database stores, and what relationships exist among those data. The logical level thus describes an entire database in terms of a small number of relatively simple structures. Although implementation of the simple structures at the logical level may involve complex physical level structures, the user of the logical level does not need to be aware of this complexity. This is referred to as physical data independence.
Database administrator Database administrators (DBAs) use specialized software to store and organize data. The role may include capacity planning, installation, configuration, database design, migration, performance monitoring, security, troubleshooting, as well as ba ...
s, who must decide what information to keep in a database, use the logical level of abstraction. View level: The highest level of abstraction describes only part of the entire database. Even though the logical level uses simpler structures, complexity remains because of the variety of information stored in a large database. Many users of a database system do not need all this information; instead, they need to access only a part of the database. The view level of abstraction exists to simplify their interaction with the system. The system may provide many views for the same database.


Layered architecture

The ability to provide a
design A design is a plan or specification for the construction of an object or system or for the implementation of an activity or process or the result of that plan or specification in the form of a prototype, product, or process. The verb ''to design'' ...
of different levels of abstraction can * simplify the design considerably * enable different role players to effectively work at various levels of abstraction * support the portability of
software artifact An artifact is one of many kinds of ''tangible'' by-products produced during the development of software. Some artifacts (e.g., use cases, class diagrams, and other Unified Modeling Language (UML) models, requirements and design documents) help de ...
s (model-based ideally)
Systems design Systems design interfaces, and data for an electronic control system to satisfy specified requirements. System design could be seen as the application of system theory to product development. There is some overlap with the disciplines of system ...
and
business process design Business process modeling (BPM) in business process management and systems engineering is the activity of representing processes of an enterprise, so that the current business processes may be analyzed, improved, and automated. BPM is typically ...
can both use this. Some design processes specifically generate designs that contain various levels of abstraction. Layered architecture partitions the concerns of the application into stacked groups (layers). It is a technique used in designing computer software, hardware, and communications in which system or network components are isolated in layers so that changes can be made in one layer without affecting the others.


See also

*
Abstraction principle (computer programming) In software engineering and programming language theory, the abstraction principle (or the principle of abstraction) is a basic dictum that aims to reduce duplication of information in a program (usually with emphasis on code duplication) whenever ...
*
Abstraction inversion In computer programming, abstraction inversion is an anti-pattern arising when users of a construct need functions implemented within it but not exposed by its interface. The result is that the users re-implement the required functions in terms of ...
for an anti-pattern of one danger in abstraction *
Abstract data type In computer science, an abstract data type (ADT) is a mathematical model for data types. An abstract data type is defined by its behavior (semantics) from the point of view of a ''user'', of the data, specifically in terms of possible values, pos ...
for an abstract description of a set of data *
Algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algorithms are used as specificat ...
for an abstract description of a computational procedure *
Bracket abstraction A bracket is either of two tall fore- or back-facing punctuation marks commonly used to isolate a segment of text or data from its surroundings. Typically deployed in symmetric pairs, an individual bracket may be identified as a 'left' or 'r ...
for making a term into a function of a variable * Data modeling for structuring data independent of the processes that use it * Encapsulation for abstractions that hide implementation details * Greenspun's Tenth Rule for an aphorism about an (the?) optimum point in the space of abstractions * Higher-order function for abstraction where functions produce or consume other functions *
Lambda abstraction Lambda calculus (also written as ''λ''-calculus) is a formal system in mathematical logic for expressing computation based on function abstraction and application using variable binding and substitution. It is a universal model of computation tha ...
for making a term into a function of some variable *
List of abstractions (computer science) This list contains abstractions used in computer programming. Notes References Abstraction Programming language concepts {{DEFAULTSORT:Abstractions computer science ...
*
Refinement Refinement may refer to: Mathematics * Equilibrium refinement, the identification of actualized equilibria in game theory * Refinement of an equivalence relation, in mathematics ** Refinement (topology), the refinement of an open cover in mathem ...
for the opposite of abstraction in computing *
Integer (computer science) In computer science, an integer is a datum of integral data type, a data type that represents some range of mathematical integers. Integral data types may be of different sizes and may or may not be allowed to contain negative values. Integers are ...
*
Heuristic (computer science) In mathematical optimization and computer science, heuristic (from Greek εὑρίσκω "I find, discover") is a technique designed for solving a problem more quickly when classic methods are too slow for finding an approximate solution, or whe ...


References


Further reading

* *
Abstraction/information hiding
– CS211 course, Cornell University. * * *


External links


SimArch
example of layered architecture for distributed simulation systems. {{DEFAULTSORT:Abstraction (computer science) Data management Articles with example Java code Abstraction Computer science Object-oriented programming