Low-level Programming
   HOME

TheInfoList



OR:

A low-level programming language is a
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
that provides little or no
abstraction Abstraction in its main sense is a conceptual process wherein general rules and concepts are derived from the usage and classification of specific examples, literal ("real" or "concrete") signifiers, first principles, or other methods. "An abstr ...
from a computer's
instruction set architecture In computer science, an instruction set architecture (ISA), also called computer architecture, is an abstract model of a computer. A device that executes instructions described by that ISA, such as a central processing unit (CPU), is called an ' ...
—commands or functions in the language map that are structurally similar to processor's instructions. Generally, this refers to either
machine code In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a very ...
or
assembly language In computer programming, assembly language (or assembler language, or symbolic machine code), often referred to simply as Assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence be ...
. Because of the low (hence the word) abstraction between the language and machine language, low-level languages are sometimes described as being "close to the hardware". Programs written in low-level languages tend to be relatively non-portable, due to being optimized for a certain type of system architecture. Low-level languages can convert to machine code without a
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
or interpreter
second-generation programming language The label of second-generation programming language (2GL) is a generational way to categorize assembly languages. The term was coined to provide a distinction from higher level machine independent third-generation programming languages (3GLs) (su ...
s use a simpler processor called an
assembler Assembler may refer to: Arts and media * Nobukazu Takemura, avant-garde electronic musician, stage name Assembler * Assemblers, a fictional race in the ''Star Wars'' universe * Assemblers, an alternative name of the superhero group Champions of A ...
—and the resulting code runs directly on the processor. A program written in a low-level language can be made to run very quickly, with a small
memory footprint Memory footprint refers to the amount of main memory that a program uses or references while running. The word footprint generally refers to the extent of physical dimensions that an object occupies, giving a sense of its size. In computing, the ...
. An equivalent program in a
high-level language In computer science, a high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be easier to us ...
can be less efficient and use more memory. Low-level languages are simple, but considered difficult to use, due to numerous technical details that the programmer must remember. By comparison, a
high-level programming language In computer science, a high-level programming language is a programming language with strong Abstraction (computer science), abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ...
isolates execution semantics of a computer architecture from the specification of the program, which simplifies development.


Machine code

Machine code In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a very ...
is the only language a computer can process directly without a previous transformation. Currently, programmers almost never write programs directly in machine code, because it requires attention to numerous details that a high-level language handles automatically. Furthermore, it requires memorizing or looking up numerical codes for every instruction, and is extremely difficult to modify. True ''machine code'' is a stream of raw, usually
binary Binary may refer to: Science and technology Mathematics * Binary number, a representation of numbers using only two digits (0 and 1) * Binary function, a function that takes two arguments * Binary operation, a mathematical operation that t ...
, data. A programmer coding in "machine code" normally codes instructions and data in a more readable form such as
decimal The decimal numeral system (also called the base-ten positional numeral system and denary or decanary) is the standard system for denoting integer and non-integer numbers. It is the extension to non-integer numbers of the Hindu–Arabic numeral ...
,
octal The octal numeral system, or oct for short, is the radix, base-8 number system, and uses the Numerical digit, digits 0 to 7. This is to say that 10octal represents eight and 100octal represents sixty-four. However, English, like most languages, ...
, or
hexadecimal In mathematics and computing, the hexadecimal (also base-16 or simply hex) numeral system is a positional numeral system that represents numbers using a radix (base) of 16. Unlike the decimal system representing numbers using 10 symbols, hexa ...
which is translated to internal format by a program called a
loader Loader can refer to: * Loader (equipment) * Loader (computing) ** LOADER.EXE, an auto-start program loader optionally used in the startup process of Microsoft Windows ME * Loader (surname) * Fast loader * Speedloader * Boot loader ** LOADER.COM ...
or toggled into the computer's memory from a
front panel A front panel was used on early electronic computers to display and allow the alteration of the state of the machine's internal registers and memory. The front panel usually consisted of arrays of indicator lamps, digit and symbol displays, to ...
. Although few programs are written in machine languages, programmers often become adept at reading it through working with
core dump In computing, a core dump, memory dump, crash dump, storage dump, system dump, or ABEND dump consists of the recorded state of the working memory of a computer program at a specific time, generally when the program has crashed or otherwise terminat ...
s or debugging from the front panel. Example: A function in hexadecimal representation of 32-bit
x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was introd ...
machine code to calculate the ''n''th
Fibonacci number In mathematics, the Fibonacci numbers, commonly denoted , form a sequence, the Fibonacci sequence, in which each number is the sum of the two preceding ones. The sequence commonly starts from 0 and 1, although some authors start the sequence from ...
: 8B542408 83FA0077 06B80000 0000C383 FA027706 B8010000 00C353BB 01000000 B9010000 008D0419 83FA0376 078BD989 C14AEBF1 5BC3


Assembly language

Second-generation languages provide one abstraction level on top of the machine code. In the early days of coding on computers like
TX-0 The TX-0, for ''Transistorized Experimental computer zero'', but affectionately referred to as tixo (pronounced "tix oh"), was an early fully transistorized computer and contained a then-huge 64 K of 18-bit words of magnetic-core memory. Constru ...
and
PDP-1 The PDP-1 (''Programmed Data Processor-1'') is the first computer in Digital Equipment Corporation's PDP series and was first produced in 1959. It is famous for being the computer most important in the creation of hacker culture at Massachusetts ...
, the first thing
MIT The Massachusetts Institute of Technology (MIT) is a private land-grant research university in Cambridge, Massachusetts. Established in 1861, MIT has played a key role in the development of modern technology and science, and is one of the m ...
hackers A hacker is a person skilled in information technology who uses their technical knowledge to achieve a goal or overcome an obstacle, within a computerized system by non-standard means. Though the term ''hacker'' has become associated in popu ...
did was to write assemblers.
Assembly language In computer programming, assembly language (or assembler language, or symbolic machine code), often referred to simply as Assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence be ...
has little
semantics Semantics (from grc, σημαντικός ''sēmantikós'', "significant") is the study of reference, meaning, or truth. The term can be used to refer to subfields of several distinct disciplines, including philosophy Philosophy (f ...
or formal specification, being only a mapping of human-readable symbols, including symbolic addresses, to
opcode In computing, an opcode (abbreviated from operation code, also known as instruction machine code, instruction code, instruction syllable, instruction parcel or opstring) is the portion of a machine language instruction that specifies the operat ...
s, addresses, numeric constants,
strings String or strings may refer to: *String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects Arts, entertainment, and media Films * ''Strings'' (1991 film), a Canadian anim ...
and so on. Typically, one
machine instruction In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a very ...
is represented as one line of assembly code. Assemblers produce
object file An object file is a computer file containing object code, that is, machine code output of an assembler or compiler. The object code is usually relocatable, and not usually directly executable. There are various formats for object files, and the ...
s that can link with other object files or be loaded on their own. Most assemblers provide macros to generate common sequences of instructions. Example: The same
Fibonacci number In mathematics, the Fibonacci numbers, commonly denoted , form a sequence, the Fibonacci sequence, in which each number is the sum of the two preceding ones. The sequence commonly starts from 0 and 1, although some authors start the sequence from ...
calculator as above, but in x86-64 assembly language using
AT&T syntax x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for t ...
: _fib: movl $1, %eax xorl %ebx, %ebx .fib_loop: cmpl $1, %edi jbe .fib_done movl %eax, %ecx addl %ebx, %eax movl %ecx, %ebx subl $1, %edi jmp .fib_loop .fib_done: ret In this code example, hardware features of the x86-64 processor (its registers) are named and manipulated directly. The function loads its input from ''%edi'' in accordance to the System V ABI and performs its calculation by manipulating values in the EAX, EBX, and ECX registers until it has finished and returns. Note that in this assembly language, there is no concept of returning a value. The result having been stored in the EAX register, the RET command simply moves code processing to the code location stored on the stack (usually the instruction immediately after the one that called this function) and it is up to the author of the calling code to know that this function stores its result in EAX and to retrieve it from there. x86-64 assembly language imposes no standard for returning values from a function (and in fact, has no concept of a function); it is up to the calling code to examine state after the procedure returns if it needs to extract a value. Compare this with the same function in C, a
high-level language In computer science, a high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be easier to us ...
: unsigned int fib(unsigned int n) This code is very similar in structure to the assembly language example but there are significant differences in terms of abstraction: * The input (parameter n) is an abstraction that does not specify any storage location on the hardware. In practice, the C compiler follows one of many possible
calling convention In computer science, a calling convention is an implementation-level (low-level) scheme for how subroutines or functions receive parameters from their caller and how they return a result. When some code calls a function, design choices have been ...
s to determine a storage location for the input. * The assembly language version loads the input parameter from the stack into a register and in each iteration of the loop decrements the value in the register, never altering the value in the memory location on the stack. The C compiler could load the parameter into a register and do the same or could update the value wherever it is stored. Which one it chooses is an implementation decision completely hidden from the code author (and one with no
side effects In medicine, a side effect is an effect, whether therapeutic or adverse, that is secondary to the one intended; although the term is predominantly employed to describe adverse effects, it can also apply to beneficial, but unintended, consequence ...
, thanks to C language standards). * The local variables a, b and c are abstractions that do not specify any specific storage location on the hardware. The C compiler decides how to actually store them for the target architecture. * The return function specifies the value to return, but does not dictate ''how'' it is returned. The C compiler for any specific architecture implements a standard mechanism for returning the value. Compilers for the x86 architecture typically (but not always) use the EAX register to return a value, as in the assembly language example (the author of the assembly language example has ''chosen'' to copy the C convention but assembly language does not require this). These abstractions make the C code compilable without modification on any architecture for which a C compiler has been written. The x86 assembly language code is specific to the x86 architecture.


Low-level programming in high-level languages

During the late 1960s, high-level languages such as
PL/S PL/S, short for Programming Language/Systems, is a "machine-oriented" programming language based on PL/I. It was developed by IBM in the late 1960s, under the name Basic Systems Language (BSL), as a replacement for assembly language on internal ...
,
BLISS BLISS is a system programming language developed at Carnegie Mellon University (CMU) by W. A. Wulf, D. B. Russell, and A. N. Habermann around 1970. It was perhaps the best known system language until C debuted a few years later. Since then, C b ...
,
BCPL BCPL ("Basic Combined Programming Language") is a procedural, imperative, and structured programming language. Originally intended for writing compilers for other languages, BCPL is no longer in common use. However, its influence is still ...
, extended
ALGOL ALGOL (; short for "Algorithmic Language") is a family of imperative computer programming languages originally developed in 1958. ALGOL heavily influenced many other languages and was the standard method for algorithm description used by the ...
(for
Burroughs large systems The Burroughs Large Systems Group produced a family of large 48-bit mainframes using stack machine instruction sets with dense syllables.E.g., 12-bit syllables for B5000, 8-bit syllables for B6500 The first machine in the family was the B5000 in ...
) and C included some degree of access to low-level programming functions. One method for this is
Inline assembly In computer programming, an inline assembler is a feature of some compilers that allows low-level code written in assembly language to be embedded within a program, among code that otherwise has been compiled from a higher-level language such as C ...
, in which assembly code is embedded in a high-level language that supports this feature. Some of these languages also allow architecture-dependent compiler optimization directives to adjust the way a compiler uses the target processor architecture.


References

{{X86 assembly topics Programming language classification Articles with example C code