HOME

TheInfoList



OR:

S is a statistical
programming language A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
developed primarily by John Chambers and (in earlier versions) Rick Becker, Trevor Hastie, William Cleveland and Allan Wilks of
Bell Laboratories Nokia Bell Labs, commonly referred to as ''Bell Labs'', is an American industrial research and development company owned by Finnish technology company Nokia. With headquarters located in Murray Hill, New Jersey, the company operates several lab ...
. The aim of the language, as expressed by John Chambers, is "to turn ideas into software, quickly and faithfully". It was formerly widely used by academic researchers., but has now been superseded by the partially backwards compatible R language, a part of the GNU free software project.
S-PLUS S-PLUS is a commercial implementation of the S (programming language), S programming language sold by TIBCO Software Inc. It features object-oriented programming capabilities and advanced analytical algorithms. Its statistical analysis capabilit ...
was a widely used commercial implementation of S that was formerly sold by TIBCO Software.


History


"Old S"

S is one of several statistical computing languages that were designed at Bell Laboratories, and first took form between 1975–1976. Up to that time, much of the statistical computing was done by directly calling Fortran subroutines; however, S was designed to offer an alternate and more interactive approach, motivated in part by
exploratory data analysis In statistics, exploratory data analysis (EDA) is an approach of data analysis, analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods. A statistical model can be used or ...
advocated by
John Tukey John Wilder Tukey (; June 16, 1915 – July 26, 2000) was an American mathematician and statistician, best known for the development of the fast Fourier Transform (FFT) algorithm and box plot. The Tukey range test, the Tukey lambda distributi ...
. Early design decisions that hold even today include interactive graphics devices (printers and character terminals at the time), and providing easily accessible documentation for the functions. Development of the project was led by John Chambers and Trevor Hastie, and included developers Richard Becker, Allan Wilks, John Chambers, and William Cleveland, all of whom were then employees of
AT&T AT&T Inc., an abbreviation for its predecessor's former name, the American Telephone and Telegraph Company, is an American multinational telecommunications holding company headquartered at Whitacre Tower in Downtown Dallas, Texas. It is the w ...
. Out of the developers who contributed to S, Chambers is generally agreed to be the most significant contributor. Chambers received the Software System Award from the
Association for Computing Machinery The Association for Computing Machinery (ACM) is a US-based international learned society for computing. It was founded in 1947 and is the world's largest scientific and educational computing society. The ACM is a non-profit professional membe ...
for his work on S. The first working version of S was built in 1976, and ran on the GCOS operating system. At this time, S was unnamed; naming suggestions included ''ISCS (Interactive SCS)'', ''SCS (Statistical Computing System)'', and ''SAS (Statistical Analysis System)'' (which was already taken: see
SAS System SAS (previously "Statistical Analysis System") is a statistical software suite developed by SAS Institute for data management, advanced analytics, multivariate analysis, business intelligence, and predictive analytics. SAS was developed at No ...
). The name 'S' (used with single quotation marks until 1979) was chosen, as it was a common letter in the suggestions and consistent with other programming languages designed from the same institution at the time (namely the
C programming language C (''pronounced'' '' – like the letter c'') is a general-purpose programming language. It was created in the 1970s by Dennis Ritchie and remains very widely used and influential. By design, C's features cleanly reflect the capabilities of ...
). It stands for the word "statistics". When UNIX/32V was ported to the (then new) 32-bit DEC VAX, computing on the
Unix Unix (, ; trademarked as UNIX) is a family of multitasking, multi-user computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...
platform became feasible for S. In late 1979, S2 was ported from GCOS to UNIX, which would become the new primary platform. In 1980 the first version of S was distributed outside Bell Laboratories and in 1981 source versions were made available. S was distributed freely in academic circles, and became popular among academic statisticians. The research team at Bell Laboratories published two books in 1984: ''S: An Interactive Environment for Data Analysis and Graphics'' (known as the 'Brown Book') and ''Extending the S System''. Also, in 1984 the source code for S became licensed through AT&T Software Sales for education and commercial purposes.


"New S"

The first version of
S-PLUS S-PLUS is a commercial implementation of the S (programming language), S programming language sold by TIBCO Software Inc. It features object-oriented programming capabilities and advanced analytical algorithms. Its statistical analysis capabilit ...
was released by Statistical Sciences, Inc. in 1988. S-PLUS was later sold to TIBCO Software. By this time, many changes were made to S and the syntax of the language with the release of S3. ''The New S Language'' (known as the 'Blue Book') was published to introduce the new features, such as the transition from ''macros'' to ''functions'' and how functions can be passed to other functions (such as apply). Many other changes to the S language were to extend the concept of "objects", and to make the syntax more consistent (and strict). However, many users found the transition to ''New S'' difficult, since their macros needed to be rewritten. Many other changes to S took hold, such as the use of X11 and
PostScript PostScript (PS) is a page description language and dynamically typed, stack-based programming language. It is most commonly used in the electronic publishing and desktop publishing realm, but as a Turing complete programming language, it c ...
graphics devices, rewriting many internal functions from Fortran to C, and the use of
double precision Double-precision floating-point format (sometimes called FP64 or float64) is a floating-point arithmetic, floating-point computer number format, number format, usually occupying 64 Bit, bits in computer memory; it represents a wide range of numeri ...
(only) arithmetic. The ''New S'' language is very similar to that used in modern versions of
S-PLUS S-PLUS is a commercial implementation of the S (programming language), S programming language sold by TIBCO Software Inc. It features object-oriented programming capabilities and advanced analytical algorithms. Its statistical analysis capabilit ...
and R. The graphical user interface of S was also updated interactive graphical features after integration with
Axum Axum, also spelled Aksum (), is a town in the Tigray Region of Ethiopia with a population of 66,900 residents (as of 2015). It is the site of the historic capital of the Aksumite Empire. Axum is located in the Central Zone of the Tigray Re ...
. ''Statistical Models in S'' (known as the 'White Book') was published in 1991, introducing Wilkinson-Rogers formula notation (using the ~ operator) for defining statistical models, data frame objects, and modifications to the use of object methods and classes.


S4

The latest version of the S standard is S4, released in 1998. It provides advanced object-oriented features. S4 classes differ markedly from S3 classes; S4 formally defines the representation and inheritance for each class, and has
multiple dispatch Multiple dispatch or multimethods is a feature of some programming languages in which a Subroutine, function or Method (computer programming), method can be dynamic dispatch, dynamically dispatched based on the run time (program lifecycle phase), ...
: the generic function can be dispatched to a method based on the class of any number of arguments, not just one.


See also

*
R (programming language) R is a programming language for statistical computing and Data and information visualization, data visualization. It has been widely adopted in the fields of data mining, bioinformatics, data analysis, and data science. The core R language is ...
, derivative language based on S programming language that is partially
backward compatible In telecommunications and computing, backward compatibility (or backwards compatibility) is a property of an operating system, software, real-world product, or technology that allows for interoperability with an older legacy system, or with inpu ...
with S programs


References


External links

* '' Evolution of the S Language'', by John M. Chambers, discusses the new features in Version 4 of S (in
PostScript PostScript (PS) is a page description language and dynamically typed, stack-based programming language. It is most commonly used in the electronic publishing and desktop publishing realm, but as a Turing complete programming language, it c ...
format), {{DEFAULTSORT:S (Programming Language) Statistical programming languages Programming languages created in 1976