
The Compiler Language With No Pronounceable Acronym (INTERCAL) is an esoteric programming language that was created as a parody by Don Woods and James M. Lyon, two Princeton University students, in 1972. It satirizes aspects of the various programming languages of the time, as well as the proliferation of proposed language constructs and notations in the 1960s. There are two maintained implementations of INTERCAL dialects: C-INTERCAL (created in 1990), maintained by Eric S. Raymond and Alex Smith, and CLC-INTERCAL, maintained by Claudio Calvelli.


History

The original Princeton implementation used punched cards and the EBCDIC character set. To allow INTERCAL to run on computers using ASCII, substitutions for two characters had to be made: $ was substituted for ¢ as the ''mingle'' operator, "represent[ing] the increasing cost of software in relation to hardware", and ? was substituted for ⊻ as the unary exclusive-or operator to "correctly express the average person's reaction on first encountering exclusive-or". In recent versions of C-INTERCAL, the older operators are supported as alternatives; INTERCAL programs may now be encoded in ASCII, Latin-1, or UTF-8.


Version numbers

C-INTERCAL swaps the major and minor version numbers, compared to tradition. The HISTORY file shows releases starting at version 0.3 and having progressed to 0.31, but containing 1.26 between 0.26 and 0.27.

CLC-INTERCAL's version numbering scheme was traditional until version 0.06, when it changed to the scheme documented in the README file, which says:
* The term "version" has been replaced by "perversion" for correctness
* The perversion number consists of a floating-point number with independent signs for the integer and fractional part. Negative fractions indicate pre-escapes (so 1.-94 means "94 pre-escapes to go before 1.00". Or you can just add the numbers together and get 0.06, which is entirely a coincidence since 0.06 is not being developed)
* The fractional part of a perversion number can be integer or floating point, with a similar meaning for the parts. The current pre-escape is 1.-94.-2 which means "2 pre-pre-escapes to go before pre-escape 1.-94".


Details

INTERCAL was intended to be completely different from all other computer languages. Common operations in other languages have cryptic and redundant syntax in INTERCAL. INTERCAL has many other features designed to make it even more aesthetically unpleasing to the programmer: it uses statements such as "READ OUT", "IGNORE", "FORGET", and modifiers such as "PLEASE". This last keyword provides two reasons for the program's rejection by the compiler: if "PLEASE" does not appear often enough, the program is considered insufficiently polite, and the error message says this; if it appears too often, the program could be rejected as excessively polite. Although this feature existed in the original INTERCAL compiler, it was undocumented.

Despite the language's intentionally obtuse and wordy syntax, INTERCAL is nevertheless Turing-complete: given enough memory, INTERCAL can solve any problem that a universal Turing machine can solve. Most implementations of INTERCAL do this very slowly, however. A Sieve of Eratosthenes benchmark, computing all prime numbers less than 65536, was tested on a Sun SPARCstation 1 in 1992. In C, it took less than half a second; the same program in INTERCAL took over seventeen hours.


Documentation

The INTERCAL Reference Manual contains many paradoxical, nonsensical, or otherwise humorous instructions. The manual also contains a "tonsil", as explained in this footnote: "4) Since all other reference manuals have appendices, it was decided that the INTERCAL manual should contain some other type of removable organ."

The INTERCAL manual gives unusual names to all non-alphanumeric ASCII characters: single and double quotes are "sparks" and "rabbit ears" respectively. (The exception is the ampersand: as the Jargon File states, "what could be sillier?") The assignment operator, represented as an equals sign (INTERCAL's "half mesh") in many other programming languages, is in INTERCAL a left-arrow, <-, made up of an "angle" and a "worm", obviously read as "gets".


Syntax

Input (using the WRITE IN instruction) and output (using the READ OUT instruction) do not use the usual formats; in INTERCAL-72, WRITE IN inputs a number written out as digits in English (such as SIX FIVE FIVE THREE FIVE), and READ OUT outputs it in "butchered" Roman numerals. More recent versions have their own I/O systems.

Comments can be achieved by using the inverted statement identifiers involving NOT or N'T; these cause lines to be initially ABSTAINed so that they have no effect. (A line can be ABSTAINed from even if it doesn't have valid syntax; syntax errors happen at runtime, and only then when the line is un-ABSTAINed.)
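The spelled-out input format can be modelled in a few lines of Python. This is only a sketch of the idea, not code from any INTERCAL implementation: the function name `write_in` is hypothetical, and the accepted word list (including "OH" as an alias for zero) is an assumption of the example.

```python
# Digit names accepted by this sketch. The exact word list, including "OH"
# as an alias for zero, is an assumption made for illustration.
DIGIT_WORDS = {
    "ZERO": 0, "OH": 0, "ONE": 1, "TWO": 2, "THREE": 3, "FOUR": 4,
    "FIVE": 5, "SIX": 6, "SEVEN": 7, "EIGHT": 8, "NINE": 9,
}


def write_in(line):
    """Parse a number given as space-separated English digit names."""
    words = line.upper().split()
    if not words:
        raise ValueError("no digits supplied")
    value = 0
    for word in words:
        if word not in DIGIT_WORDS:
            raise ValueError(f"not a digit name: {word!r}")
        # Shift the digits read so far one decimal place and append this one.
        value = value * 10 + DIGIT_WORDS[word]
    return value


print(write_in("SIX FIVE FIVE THREE FIVE"))  # prints 65535
```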


Data structures

INTERCAL-72 (the original version of INTERCAL) had only four data types: the 16-bit integer (represented with a ., called a "spot"), the 32-bit integer (:, a "twospot"), the array of 16-bit integers (,, a "tail"), and the array of 32-bit integers (;, a "hybrid"). There are 65535 available variables of each type, numbered from .1 to .65535 for 16-bit integers, for instance. However, each of these variables has its own stack on which it can be pushed and popped (STASHed and RETRIEVEd, in INTERCAL terminology), increasing the possible complexity of data structures.

More modern versions of INTERCAL have by and large kept the same data structures, with appropriate modifications; TriINTERCAL, which modifies the radix with which numbers are represented, can use a 10-trit type rather than a 16-bit type, and CLC-INTERCAL implements many of its own data structures, such as "classes and lectures", by making the basic data types store more information rather than adding new types.

Arrays are dimensioned by assigning to them as if they were a scalar variable. Constants can also be used, and are represented by a # ("mesh") followed by the constant itself, written as a decimal number; only integer constants from 0 to 65535 are supported.
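The per-variable stack can be pictured with a small Python model. This is a sketch under stated assumptions rather than a description of any real implementation: the class name `Spot`, the initial value of zero, and the choice of exceptions are all hypothetical.

```python
class Spot:
    """Toy model of a 16-bit variable that carries its own STASH stack."""

    MAX_VALUE = 0xFFFF  # largest value that fits in 16 bits

    def __init__(self):
        self.value = 0    # assumption: variables start out as zero
        self._stash = []  # private stack belonging to this variable only

    def assign(self, value):
        # Rejecting out-of-range values is this sketch's choice of behaviour.
        if not 0 <= value <= self.MAX_VALUE:
            raise ValueError("value does not fit in 16 bits")
        self.value = value

    def stash(self):
        """STASH: push the current value onto the variable's stack."""
        self._stash.append(self.value)

    def retrieve(self):
        """RETRIEVE: pop the most recently stashed value back into place."""
        if not self._stash:
            raise RuntimeError("nothing has been STASHed")
        self.value = self._stash.pop()


spot = Spot()
spot.assign(5)
spot.stash()       # remember 5
spot.assign(9)     # overwrite the live value
spot.retrieve()    # restore the remembered value
print(spot.value)  # prints 5
```

Because every variable has such a stack, a program can emulate local variables or recursion by stashing values before a call and retrieving them afterwards.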


Operators

There are only five operators in INTERCAL-72. Implementations vary in which characters represent which operation, and many accept more than one character, so more than one possibility is given for many of the operators.

Contrary to most other languages, AND, OR, and XOR are unary operators, which work on consecutive bits of their argument; the most significant bit of the result is the operator applied to the least significant and most significant bits of the input, the second-most-significant bit of the result is the operator applied to the most and second-most significant bits, the third-most-significant bit of the result is the operator applied to the second-most and third-most bits, and so on. The operator is placed between the punctuation mark specifying a variable name or constant and the number that specifies which variable it is, or just inside grouping marks (i.e. one character later than it would be in programming languages like C).

SELECT and INTERLEAVE (which is also known as MINGLE) are infix binary operators; SELECT takes the bits of its first operand that correspond to "1" bits of its second operand and removes the bits that correspond to "0" bits, shifting towards the least significant bit and padding with zeroes (so 51 (110011 in binary) SELECT 21 (10101 in binary) is 5 (101 in binary)); MINGLE alternates bits from its first and second operands (in such a way that the least significant bit of its second operand is the least significant bit of the result).

There is no operator precedence; grouping marks must be used to disambiguate the precedence where it would otherwise be ambiguous (the grouping marks available are ' ("spark"), which matches another spark, and " ("rabbit ears"), which matches another rabbit ears; the programmer is responsible for using these in such a way that they make the expression unambiguous).
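The operator semantics described above can be checked with a short Python model. This is a sketch, not code from an INTERCAL implementation: the function names are hypothetical, and the unary operators are assumed to act on 16-bit values unless another width is passed. Pairing each bit with its more significant neighbour (and the top bit with the bottom one) is the same as combining the value with itself rotated right by one bit.

```python
def rotate_right(x, bits=16):
    """Rotate x right by one position within a word of the given width."""
    mask = (1 << bits) - 1
    return ((x >> 1) | ((x & 1) << (bits - 1))) & mask


def unary_and(x, bits=16):
    return x & rotate_right(x, bits)


def unary_or(x, bits=16):
    return x | rotate_right(x, bits)


def unary_xor(x, bits=16):
    return x ^ rotate_right(x, bits)


def select(a, b):
    """Keep the bits of a where b has a 1 bit, packed towards the low end."""
    result = 0
    out_pos = 0
    while b:
        if b & 1:
            result |= (a & 1) << out_pos
            out_pos += 1
        a >>= 1
        b >>= 1
    return result


def mingle(a, b):
    """Interleave two 16-bit values; b supplies the least significant bit."""
    result = 0
    for i in range(16):
        result |= ((b >> i) & 1) << (2 * i)
        result |= ((a >> i) & 1) << (2 * i + 1)
    return result


print(select(51, 21))    # prints 5, matching the example in the text
print(mingle(65535, 0))  # prints 2863311530 (binary 1010...10)
print(unary_and(77))     # prints 4
print(unary_or(77))      # prints 32879
print(unary_xor(77))     # prints 32875
```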


Control structures

INTERCAL statements all start with a "statement identifier"; in INTERCAL-72, this can be DO, PLEASE, or PLEASE DO, all of which mean the same to the program (but using one of these too heavily causes the program to be rejected, an undocumented feature in INTERCAL-72 that was mentioned in the C-INTERCAL manual), or an inverted form (with NOT or N'T appended to the identifier). Backtracking INTERCAL, a modern variant, also allows variants using MAYBE (possibly combined with PLEASE or DO) as a statement identifier, which introduces a choice-point. Before the identifier, an optional line number (an integer enclosed in parentheses) can be given; after the identifier, a percent chance of the line executing can be given in the format %50, which defaults to 100%.

In INTERCAL-72, the main control structures are NEXT, RESUME, and FORGET. DO (''line'') NEXT branches to the line specified, remembering on a call stack the line that would have been executed next had it not been for the NEXT (identifiers other than DO can be used on any statement; DO is given as an example); DO FORGET ''expression'' removes ''expression'' entries from the top of the call stack (this is useful to avoid the error that otherwise happens when there are more than 80 entries), and DO RESUME ''expression'' removes ''expression'' entries from the call stack and jumps to the last line remembered.

C-INTERCAL also provides the COME FROM instruction, written DO COME FROM (''line''); CLC-INTERCAL and the most recent C-INTERCAL versions also provide computed COME FROM (DO COME FROM ''expression'') and NEXT FROM, which is like COME FROM but also saves a return address on the NEXT stack.

Alternative ways to affect program flow, originally available in INTERCAL-72, are to use the IGNORE and REMEMBER instructions on variables (which cause writes to the variable to be silently ignored and to take effect again, so that instructions can be disabled by causing them to have no effect), and the ABSTAIN and REINSTATE instructions on lines or on types of statement, causing the lines to have no effect or to have an effect again respectively.
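The interplay of NEXT, FORGET, and RESUME can be modelled as operations on a bounded stack of return lines. The Python below is a sketch only: the class and method names are hypothetical, the limit of 80 entries is taken from the description above, and the handling of edge cases (forgetting more entries than exist, resuming with an invalid count) is an assumption of the example.

```python
class NextStack:
    """Toy model of the call stack used by NEXT, FORGET, and RESUME."""

    LIMIT = 80  # taken from the text: more than 80 entries is an error

    def __init__(self):
        self._entries = []

    def next(self, return_line):
        """DO (line) NEXT: remember where execution would have continued."""
        if len(self._entries) >= self.LIMIT:
            raise RuntimeError("NEXT stack overflow")
        self._entries.append(return_line)

    def forget(self, count):
        """DO FORGET count: drop entries from the top without jumping."""
        # Assumption: forgetting more entries than exist empties the stack.
        drop = min(count, len(self._entries))
        del self._entries[len(self._entries) - drop:]

    def resume(self, count):
        """DO RESUME count: drop entries and return the last one removed."""
        if count < 1 or count > len(self._entries):
            raise RuntimeError("invalid RESUME count")
        target = self._entries[-count]
        del self._entries[-count:]
        return target


stack = NextStack()
stack.next(10)
stack.next(20)
stack.next(30)
stack.forget(1)         # discards the entry for line 30
print(stack.resume(2))  # prints 10: both remaining entries are removed
```

A RESUME of 1 behaves like an ordinary subroutine return, while larger counts unwind several levels at once.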


Hello, world

The traditional "Hello, world!" program demonstrates how different INTERCAL is from standard programming languages. In C, it could read as follows:

    #include <stdio.h>

    int main(void)
    {
        puts("Hello, world!");
        return 0;
    }

The equivalent program in C-INTERCAL is longer and harder to read:

    DO ,1 <- #13
    PLEASE DO ,1 SUB #1 <- #238
    DO ,1 SUB #2 <- #108
    DO ,1 SUB #3 <- #112
    DO ,1 SUB #4 <- #0
    DO ,1 SUB #5 <- #64
    DO ,1 SUB #6 <- #194
    DO ,1 SUB #7 <- #48
    PLEASE DO ,1 SUB #8 <- #22
    DO ,1 SUB #9 <- #248
    DO ,1 SUB #10 <- #168
    DO ,1 SUB #11 <- #24
    DO ,1 SUB #12 <- #16
    DO ,1 SUB #13 <- #162
    PLEASE READ OUT ,1
    PLEASE GIVE UP
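The array values in the C-INTERCAL program are not character codes. The Python sketch below shows one way to decode them, assuming the commonly described behaviour of C-INTERCAL's array output: each element is subtracted, modulo 256, from a running position that starts at zero, and the bit-reversed position is the character written. The function names are hypothetical and the model is an illustration, not the compiler's own code.

```python
def reverse_bits(byte):
    """Reverse the order of the eight bits in a byte."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (byte & 1)
        byte >>= 1
    return result


def read_out_array(values):
    """Decode array elements into text under the assumed output model."""
    position = 0  # running position, carried from one character to the next
    chars = []
    for value in values:
        position = (position - value) % 256
        chars.append(chr(reverse_bits(position)))
    return "".join(chars)


# The thirteen values assigned to ,1 in the program above.
HELLO = [238, 108, 112, 0, 64, 194, 48, 22, 248, 168, 24, 16, 162]
print(read_out_array(HELLO))  # prints Hello, world!
```

Because each value is relative to the previous position, the element #0 at index 4 simply repeats the preceding character, which is how the double "l" is produced.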


Dialects

The original Woods–Lyon INTERCAL was very limited in its input/output capabilities: the only acceptable input was numbers with the digits spelled out, and the only output was an extended version of Roman numerals.

The C-INTERCAL reimplementation, being available on the Internet, has made the language more popular with devotees of esoteric programming languages. The C-INTERCAL dialect has a few differences from original INTERCAL and introduced a few new features, such as a COME FROM statement and a means of doing text I/O based on the Turing Text Model. The authors of C-INTERCAL also created the TriINTERCAL variant, based on the ternary numeral system and generalizing INTERCAL's set of operators.

A more recent variant is Threaded Intercal, which extends the functionality of COME FROM to support multithreading. CLC-INTERCAL has a library called INTERNET for networking functionality including being an INTERCAL server, and also includes features such as Quantum Intercal, which enables multi-value calculations in a way purportedly ready for the first quantum computers.

In early 2017 a .NET implementation targeting the .NET Framework appeared on GitHub. This implementation supports the creation of standalone binary libraries and interop with other programming languages.


Impact and discussion

In the article "A Box, Darkly: Obfuscation, Weird Languages, and Code Aesthetics", INTERCAL is described under the heading "Abandon all sanity, ye who enter here: INTERCAL". The compiler and commenting strategy are among the "weird" features described. In "Technomasochism", Lev Bratishenko characterizes the INTERCAL compiler as a dominatrix.


Popular culture

The Nitrome Enjoyment System, a fictional video game console created by British indie game developer Nitrome, has games which are programmed in INTERCAL.


External links

Official website of C-INTERCAL

INTERCAL Resources on the Web, including several implementations
Computerworld Interview with Don Woods on INTERCAL

