HOME

TheInfoList



OR:

In
computer programming Computer programming is the process of performing a particular computation (or more generally, accomplishing a specific computing result), usually by designing and building an executable computer program. Programming involves tasks such as anal ...
, self-hosting is the use of a
program Program, programme, programmer, or programming may refer to: Business and management * Program management, the process of managing several related projects * Time management * Program, a part of planning Arts and entertainment Audio * Programm ...
as part of the
toolchain In software, a toolchain is a set of programming tools that is used to perform a complex software development task or to create a software product, which is typically another computer program or a set of related programs. In general, the tools for ...
or
operating system An operating system (OS) is system software that manages computer hardware, software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ef ...
that produces new versions of that same program—for example, a
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs tha ...
that can compile its own
source code In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the ...
. Self-hosting
software Software is a set of computer programs and associated documentation and data. This is in contrast to hardware, from which the system is built and which actually performs the work. At the lowest programming level, executable code consist ...
is commonplace on
personal computer A personal computer (PC) is a multi-purpose microcomputer whose size, capabilities, and price make it feasible for individual use. Personal computers are intended to be operated directly by an end user, rather than by a computer expert or te ...
s and larger systems. Other programs that are typically self-hosting include
kernel Kernel may refer to: Computing * Kernel (operating system), the central component of most operating systems * Kernel (image processing), a matrix used for image convolution * Compute kernel, in GPGPU programming * Kernel method, in machine learn ...
s,
assemblers Assembler may refer to: Arts and media * Nobukazu Takemura, avant-garde electronic musician, stage name Assembler * Assemblers, a fictional race in the ''Star Wars'' universe * Assemblers, an alternative name of the superhero group Champions of A ...
,
command-line interpreter A command-line interpreter or command-line processor uses a command-line interface (CLI) to receive commands from a user in the form of lines of text. This provides a means of setting parameters for the environment, invoking executables and pro ...
s and
revision control software In software engineering, version control (also known as revision control, source control, or source code management) is a class of systems responsible for managing changes to computer programs, documents, large web sites, or other collections ...
.


Operating systems

An operating system is self-hosted when the toolchain to build the operating system runs on that OS. For example, Windows can be built on a computer running Windows. However, operating systems don't start out as self-hosted. When developing for a new computer or operating system, you need a system to run the development software, but you need development software to write the operating system. This is called a
bootstrapping In general, bootstrapping usually refers to a self-starting process that is supposed to continue or grow without external input. Etymology Tall boots may have a tab, loop or handle at the top known as a bootstrap, allowing one to use fingers ...
problem or, more generically, a
chicken or the egg The chicken or the egg causality dilemma is commonly stated as the question, "which came first: the chicken or the egg?" The dilemma stems from the observation that all chickens hatch from eggs and all chicken eggs are laid by chickens. "Chick ...
dilemma. A solution to this problem is the
cross compiler A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is running. For example, a compiler that runs on a PC but generates code that runs on an Android smartphone is a cross c ...
(or cross assembler when working with assembly language). A cross compiler allows you to use one computer to develop software for a different machine or operating system, making it possible to create an operating system for a machine for which one does not yet exist. Once written, software can be deployed to the target system using the usual means: an
EPROM An EPROM (rarely EROM), or erasable programmable read-only memory, is a type of programmable read-only memory (PROM) chip that retains its data when its power supply is switched off. Computer memory that can retrieve stored data after a power s ...
,
floppy disk A floppy disk or floppy diskette (casually referred to as a floppy, or a diskette) is an obsolescent type of disk storage composed of a thin and flexible disk of a magnetic storage medium in a square or nearly square plastic enclosure lined ...
ette,
flash memory Flash memory is an electronic non-volatile computer memory storage medium that can be electrically erased and reprogrammed. The two main types of flash memory, NOR flash and NAND flash, are named for the NOR and NAND logic gates. Both use ...
(such as a USB thumb drive), or
JTAG JTAG (named after the Joint Test Action Group which codified it) is an industry standard for verifying designs and testing printed circuit boards after manufacture. JTAG implements standards for on-chip instrumentation in electronic design aut ...
device. This is similar to the method used to write software for gaming consoles or for handheld devices, like cellular phones or tablets, which do not host their own development tools. Once the software is mature enough to compile its own code, the cross-development dependency ends. At this point, an operating system is said to be self-hosted.


Compilers

Software development using compiler or interpreters can also be self hosted when the compiler is capable of compiling itself. Since self-hosted compilers suffer from the same bootstrap problems as operating systems, a compiler for a new programming language needs to be written in an existing language. So the developer may use something like assembly language, C/C++, or even a scripting language like Python or Lua to build the first version of the compiler. Once the language is mature enough, development of the compiler can shift to the compiler's native language, allowing the compiler to build itself.


History

The first self-hosting compiler (excluding assemblers) was written for
Lisp A lisp is a speech impairment in which a person misarticulates sibilants (, , , , , , , ). These misarticulations often result in unclear speech. Types * A frontal lisp occurs when the tongue is placed anterior to the target. Interdental lispin ...
by Hart and Levin at MIT in 1962. They wrote a Lisp compiler in Lisp, testing it inside an existing Lisp Interpreter. Once they had improved the compiler to the point where it could compile its own source code, it was self-hosting. This technique is usually only practicable when an interpreter already exists for the very same language that is to be compiled; though possible, it is extremely uncommon to humanly compile a compiler with itself. The concept borrows directly from and is an example of the broader notion of running a program on itself as input, used also in various proofs in
theoretical computer science computer science (TCS) is a subset of general computer science and mathematics that focuses on mathematical aspects of computer science such as the theory of computation, lambda calculus, and type theory. It is difficult to circumscribe the ...
, such as the proof that the
halting problem In computability theory, the halting problem is the problem of determining, from a description of an arbitrary computer program and an input, whether the program will finish running, or continue to run forever. Alan Turing proved in 1936 that a ...
is undecidable.


Examples

Ken Thompson Kenneth Lane Thompson (born February 4, 1943) is an American pioneer of computer science. Thompson worked at Bell Labs for most of his career where he designed and implemented the original Unix operating system. He also invented the B programmi ...
started development on
Unix Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, ...
in 1968 by writing and compiling programs on the GE-635 and carrying them over to the PDP-7 for testing. After the initial Unix kernel, a command interpreter, an editor, an assembler, and a few utilities were completed, the Unix operating system was self-hosting - programs could be written and tested on the PDP-7 itself.Dennis M. Ritchie
"The Development of the C Language"
1993.
Douglas McIlroy Malcolm Douglas McIlroy (born 1932) is a mathematician, engineer, and programmer. As of 2019 he is an Adjunct Professor of Computer Science at Dartmouth College. McIlroy is best known for having originally proposed Unix pipelines and developed se ...
wrote TMG (a
compiler-compiler In computer science, a compiler-compiler or compiler generator is a programming tool that creates a parser, interpreter, or compiler from some form of formal description of a programming language and machine. The most common type of compiler ...
) in TMG on a piece of paper and "decided to give his piece of paper to his piece of paper," doing the computation himself, thus compiling a TMG compiler into assembly, which he typed up and assembled on Ken Thompson's PDP-7. Development of the
GNU GNU () is an extensive collection of free software (383 packages as of January 2022), which can be used as an operating system or can be used in parts with other operating systems. The use of the completed GNU tools led to the family of operat ...
system relies largely on GCC (the GNU C
Compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs tha ...
) and GNU
Emacs Emacs , originally named EMACS (an acronym for "Editor MACroS"), is a family of text editors that are characterized by their extensibility. The manual for the most widely used variant, GNU Emacs, describes it as "the extensible, customizable, ...
(a popular editor), making possible the self contained, maintained and sustained development of
free software Free software or libre software is computer software distributed under terms that allow users to run the software for any purpose as well as to study, change, and distribute it and any adapted versions. Free software is a matter of liberty, n ...
for the
GNU Project The GNU Project () is a free software, mass collaboration project announced by Richard Stallman on September 27, 1983. Its goal is to give computer users freedom and control in their use of their computers and computing devices by collabor ...
. Many
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s have self-hosted implementations: compilers that are both in and for the same language. In some of these cases, the initial implementation was developed using
bootstrapping In general, bootstrapping usually refers to a self-starting process that is supposed to continue or grow without external input. Etymology Tall boots may have a tab, loop or handle at the top known as a bootstrap, allowing one to use fingers ...
, i.e. using another high-level language, assembler, or even
machine language In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a ver ...
.


List of languages having self-hosting compilers

The following programming languages have self-hosting compilers:


See also

*
Bootstrapping (compilers) In computer science, bootstrapping is the technique for producing a self-compiling compiler – that is, a compiler (or assembler) written in the source programming language that it intends to compile. An initial core version of the compiler (th ...
*
Compiler-compiler In computer science, a compiler-compiler or compiler generator is a programming tool that creates a parser, interpreter, or compiler from some form of formal description of a programming language and machine. The most common type of compiler ...
*
Cross-compiler A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is running. For example, a compiler that runs on a PC but generates code that runs on an Android smartphone is a cross ...
*
Dogfooding Eating your own dog food or "dogfooding" is the practice of using one's own products or services. This can be a way for an organization to test its products in real-world usage using product management techniques. Hence dogfooding can act as qual ...
* Futamura projection * Self-interpreter *
Self-reference Self-reference occurs in natural or formal languages when a sentence, idea or formula refers to itself. The reference may be expressed either directly—through some intermediate sentence or formula—or by means of some encoding. In philoso ...
* Indirect self-modification


References

{{Reflist Computer programming Self-hosting software