Array Bounds Checking

In computer programming, bounds checking is any method of detecting whether a variable is within some bounds before it is used. It is usually used to ensure that a number fits into a given type (range checking), or that a variable being used as an array index is within the bounds of the array (index checking). A failed bounds check usually results in the generation of some sort of exception signal. Because performing bounds checking during every usage is time-consuming, it is not always done. Bounds-checking elimination is a compiler optimization technique that eliminates unneeded bounds checking.
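The idea behind bounds-checking elimination can be sketched by hand. The example below is only an illustration, not how any particular compiler works: the function names are hypothetical, and Python itself still checks `values[i]` internally. It contrasts a loop that checks every index with one where a single check before the loop covers every index the loop will use.

```python
def checked_get(values, i):
    """Explicit index check on every access (hypothetical helper)."""
    if not 0 <= i < len(values):
        raise IndexError(f"index {i} out of bounds for length {len(values)}")
    return values[i]


def sum_prefix_checked(values, n):
    """Sum the first n elements, checking each index individually."""
    total = 0
    for i in range(n):
        total += checked_get(values, i)
    return total


def sum_prefix_hoisted(values, n):
    """Sum the first n elements with one check hoisted out of the loop."""
    if n > len(values):
        raise IndexError(f"prefix length {n} out of bounds for length {len(values)}")
    total = 0
    for i in range(n):
        # Here 0 <= i < n <= len(values), so a per-access check would be redundant.
        total += values[i]
    return total
```

Both functions accept and reject the same inputs; the second performs one comparison instead of one per iteration, which is the kind of redundancy an optimizing compiler can prove and remove.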


Range checking

A range check is a check to make sure a number is within a certain range; for example, to ensure that a value about to be assigned to a 16-bit integer is within the capacity of a 16-bit integer (i.e. checking against wrap-around). This is not quite the same as type checking. Other range checks may be more restrictive; for example, a variable to hold the number of a calendar month may be declared to accept only the range 1 to 12.
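The two range checks described above can be sketched as follows. This is a minimal illustration assuming a signed 16-bit integer; the helper names (`check_range`, `to_int16`, `to_month`, `wrap_int16`) are hypothetical and not taken from any library.

```python
INT16_MIN = -(2 ** 15)   # -32768
INT16_MAX = 2 ** 15 - 1  # 32767


def check_range(value, low, high, what="value"):
    """Return value unchanged if low <= value <= high, otherwise raise."""
    if not low <= value <= high:
        raise ValueError(f"{what} {value} is outside the range {low}..{high}")
    return value


def to_int16(value):
    """Range check against the capacity of a signed 16-bit integer."""
    return check_range(value, INT16_MIN, INT16_MAX, "16-bit integer")


def to_month(value):
    """A more restrictive range check: calendar months are 1 to 12."""
    return check_range(value, 1, 12, "month")


def wrap_int16(value):
    """What an unchecked signed 16-bit store would do: wrap around silently."""
    return (value + 2 ** 15) % 2 ** 16 - 2 ** 15
```

For example, `to_int16(32768)` raises `ValueError`, whereas the unchecked `wrap_int16(32768)` quietly yields `-32768`, which is the wrap-around the range check is meant to catch.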


Index checking

Index checking means that, in all expressions indexing an array, the index value is checked against the bounds of the array (which were established when the array was defined), and if the index is out of bounds, further execution is suspended via some sort of error. Because reading or especially writing a value outside the bounds of an array may cause the program to malfunction or crash or enable security vulnerabilities (see buffer overflow), index checking is a part of many high-level languages.

Early compiled programming languages with index checking ability included ALGOL 60, ALGOL 68 and Pascal, as well as interpreted programming languages such as BASIC. Many programming languages, such as C, never perform automatic bounds checking, in order to maximize speed. However, this leaves many off-by-one errors and buffer overflows uncaught. Some programmers argue that these languages sacrifice too much safety for rapid execution. In his 1980 Turing Award lecture, C. A. R. Hoare described his experience in the design of an ALGOL 60 implementation that included bounds checking, saying:
A consequence of this principle is that every occurrence of every subscript of every subscripted variable was on every occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interest of efficiency on production runs. Unanimously, they urged us not to—they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980, language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law.
Mainstream languages that enforce run-time checking include Ada, C#, Haskell, Java, JavaScript, Lisp, PHP, Python, Ruby, Rust, and Visual Basic. The D and OCaml languages have run-time bounds checking that is enabled or disabled with a compiler switch. In C++, run-time checking is not part of the language but part of the STL, and is enabled by defining a macro, typically through a compiler switch (_GLIBCXX_DEBUG=1 or _LIBCPP_DEBUG=1). C# also supports "unsafe regions": sections of code that (among other things) temporarily suspend bounds checking to raise efficiency. These are useful for speeding up small time-critical bottlenecks without sacrificing the safety of a whole program.

The JS++ programming language is able to analyze whether an array index or map key is out of bounds at compile time using existent types: nominal types that describe whether the index or key is within bounds or out of bounds and that guide code generation. Existent types have been shown to add only 1 ms of overhead to compile times.
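The run-time index check described in this section, with every subscript compared against both the lower and the upper declared bound, can be sketched as a small wrapper class. The class name `BoundedArray` and its interface are hypothetical, chosen only for illustration. The sketch is written out explicitly because Python's own lists, while checked, treat negative indices as counting from the end rather than as errors.

```python
class BoundedArray:
    """Fixed-size array with declared inclusive bounds, checked on every access."""

    def __init__(self, lower, upper, fill=0):
        if upper < lower:
            raise ValueError("upper bound must not be below lower bound")
        self.lower = lower
        self.upper = upper
        self._items = [fill] * (upper - lower + 1)

    def _offset(self, index):
        # The index check: compare against both declared bounds before use.
        if not self.lower <= index <= self.upper:
            raise IndexError(
                f"subscript {index} outside declared bounds {self.lower}..{self.upper}"
            )
        return index - self.lower

    def __getitem__(self, index):
        return self._items[self._offset(index)]

    def __setitem__(self, index, value):
        self._items[self._offset(index)] = value
```

With `days = BoundedArray(1, 12)`, the accesses `days[1]` and `days[12]` succeed, while `days[0]`, `days[13]` and `days[-1]` all raise `IndexError` instead of touching a neighbouring element.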


Hardware bounds checking

The safety added by bounds checking necessarily costs CPU time if the checking is performed in software; however, if the checks are performed by hardware, the safety can be provided "for free" with no runtime cost. An early system with hardware bounds checking was the ICL 2900 Series mainframe announced in 1974. The VAX computer has an INDEX assembly instruction for array index checking which takes six operands, all of which can use any VAX addressing mode. The B6500 and similar Burroughs computers performed bounds checking via hardware, irrespective of which computer language had been compiled to produce the machine code. A limited number of later CPUs have specialised instructions for checking bounds, e.g., the CHK2 instruction on the Motorola 68000 series.

Research has been underway since at least 2005 regarding methods to use x86's built-in virtual memory management unit to ensure safety of array and buffer accesses. In 2015 Intel provided their Intel MPX extensions in their Skylake processor architecture, which store bounds in a CPU register and a table in memory. As of early 2017 at least GCC supports MPX extensions.
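To give a feel for how a single instruction such as the VAX INDEX mentioned above can combine a bounds check with address arithmetic, the following is a simplified software model. It assumes the operand order subscript, low, high, size, indexin, indexout, with the result computed as (indexin + subscript) * size and a trap taken when the subscript lies outside low..high. Trap timing and 32-bit overflow are not modelled, and the function and exception names are hypothetical.

```python
class SubscriptRangeTrap(Exception):
    """Stands in for the hardware trap taken on an out-of-range subscript."""


def vax_index(subscript, low, high, size, indexin):
    """Simplified model of INDEX; the return value plays the role of indexout."""
    if subscript < low or subscript > high:
        raise SubscriptRangeTrap(
            f"subscript {subscript} outside {low}..{high}"
        )
    return (indexin + subscript) * size


def element_offset(i, j):
    """Byte offset of a[i][j] in an assumed 3 x 4 array of 4-byte elements."""
    rows, cols, elem_size = 3, 4, 4
    # First step: check the row subscript and scale it by the row length.
    partial = vax_index(i, 0, rows - 1, cols, 0)
    # Second step: check the column subscript, add it, scale by element size.
    return vax_index(j, 0, cols - 1, elem_size, partial)
```

Chaining one call per dimension, as in `element_offset(1, 2)`, which gives `(1 * 4 + 2) * 4 = 24`, shows why the check and the offset computation fit naturally into one operation.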


See also

* Dynamic code analysis
* Runtime error detection
* Static code analysis


External links

* "On The Advantages Of Tagged Architecture", IEEE Transactions On Computers, Volume C-22, Number 7, July, 1973.
* "The Emperor's Old Clothes", The 1980 ACM Turing Award Lecture, CACM volume 24 number 2, February 1981, pp 75–83.
* "Bcc: Runtime checking for C programs", Samuel C. Kendall, Proceedings of the USENIX Summer 1983 Conference.
* "Bounds Checking for C", Richard Jones and Paul Kelly, Imperial College, July 1995.
* "ClearPath Enterprise Servers MCP Security Overview", Unisys, April 2006.
* "Secure Virtual Architecture: A Safe Execution Environment for Commodity Operating Systems", John Criswell, Andrew Lenharth, Dinakar Dhurjati, Vikram Adve, SOSP'07 21st ACM Symposium on Operating Systems Principles, 2007.
* Yutaka Oiwa. Implementation of the Memory-safe Full ANSI-C Compiler. ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI2009), June 2009.
* "address-sanitizer", Timur Iskhodzhanov, Alexander Potapenko, Alexey Samsonov, Kostya Serebryany, Evgeniy Stepanov, Dmitriy Vyukov, LLVM Dev Meeting, November 18, 2011.
* Safe C Library of Bounded APIs
* Safe C API: Concise solution of buffer overflow, The OWASP Foundation, OWASP AppSec, Beijing 2011



